[2/2] x86/unwind: add hardcoded ORC entry for NULL

Message ID 20190301031201.7416-2-jannh@google.com (mailing list archive)
State New, archived
Series [1/2] x86/unwind: handle NULL pointer calls better in frame unwinder

Commit Message

Jann Horn March 1, 2019, 3:12 a.m. UTC

When the ORC unwinder is invoked for an oops caused by IP==0,
it currently has no idea what to do because there is no debug information
for the stack frame of NULL.

But if RIP is NULL, it is very likely that the last successfully executed
instruction was an indirect CALL/JMP, and it is possible to unwind out in
the same way as for the first instruction of a normal function. Hardcode
a corresponding ORC entry.


With an artificially-added NULL call in prctl_set_seccomp(), before this
patch, the trace is:

Call Trace:
 ? __x64_sys_prctl+0x402/0x680
 ? __ia32_sys_prctl+0x6e0/0x6e0
 ? __do_page_fault+0x457/0x620
 ? do_syscall_64+0x6d/0x160
 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9

After this patch, the trace looks like this:

Call Trace:
 __x64_sys_prctl+0x402/0x680
 ? __ia32_sys_prctl+0x6e0/0x6e0
 ? __do_page_fault+0x457/0x620
 do_syscall_64+0x6d/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

prctl_set_seccomp() still doesn't show up in the trace because for some
reason, tail call optimization is only disabled in builds that use the
frame pointer unwinder.

Signed-off-by: Jann Horn <jannh@google.com>
---
Is there a reason why the top-level Makefile only sets
-fno-optimize-sibling-calls if CONFIG_FRAME_POINTER is set?
I suspect that this is just a historical thing, because reliable
unwinding didn't work without frame pointers until ORC came along.
I'm not quite sure how best to express "don't do tail optimization if
either frame pointers are used or ORC is used" in a Makefile, and
whether we want an indirection through Kconfig for that, so I'm not
doing anything about it in this series.
Can someone send a patch to deal with it properly?


 arch/x86/kernel/unwind_orc.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Comments

Josh Poimboeuf March 1, 2019, 4:24 p.m. UTC | #1
On Fri, Mar 01, 2019 at 04:12:01AM +0100, Jann Horn wrote:
> When the ORC unwinder is invoked for an oops caused by IP==0,
> it currently has no idea what to do because there is no debug information
> for the stack frame of NULL.
> 
> But if RIP is NULL, it is very likely that the last successfully executed
> instruction was an indirect CALL/JMP, and it is possible to unwind out in
> the same way as for the first instruction of a normal function. Hardcode
> a corresponding ORC entry.
> 
> 
> With an artificially-added NULL call in prctl_set_seccomp(), before this
> patch, the trace is:
> 
> Call Trace:
>  ? __x64_sys_prctl+0x402/0x680
>  ? __ia32_sys_prctl+0x6e0/0x6e0
>  ? __do_page_fault+0x457/0x620
>  ? do_syscall_64+0x6d/0x160
>  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> After this patch, the trace looks like this:
> 
> Call Trace:
>  __x64_sys_prctl+0x402/0x680
>  ? __ia32_sys_prctl+0x6e0/0x6e0
>  ? __do_page_fault+0x457/0x620
>  do_syscall_64+0x6d/0x160
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> prctl_set_seccomp() still doesn't show up in the trace because for some
> reason, tail call optimization is only disabled in builds that use the
> frame pointer unwinder.
> 
> Signed-off-by: Jann Horn <jannh@google.com>

Thanks for the patches!

Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>

> Is there a reason why the top-level Makefile only sets
> -fno-optimize-sibling-calls if CONFIG_FRAME_POINTER is set?
> I suspect that this is just a historical thing, because reliable
> unwinding didn't work without frame pointers until ORC came along.
> I'm not quite sure how best to express "don't do tail optimization if
> either frame pointers are used or ORC is used" in a Makefile, and
> whether we want an indirection through Kconfig for that, so I'm not
> doing anything about it in this series.
> Can someone send a patch to deal with it properly?

Why would sibling calls be a problem for ORC?  Once a function does a
sibling call, it has effectively returned and shouldn't show up on the
stack trace anyway.

Josh Poimboeuf March 1, 2019, 4:29 p.m. UTC | #2
On Fri, Mar 01, 2019 at 10:24:18AM -0600, Josh Poimboeuf wrote:
> > Is there a reason why the top-level Makefile only sets
> > -fno-optimize-sibling-calls if CONFIG_FRAME_POINTER is set?
> > I suspect that this is just a historical thing, because reliable
> > unwinding didn't work without frame pointers until ORC came along.
> > I'm not quite sure how best to express "don't do tail optimization if
> > either frame pointers are used or ORC is used" in a Makefile, and
> > whether we want an indirection through Kconfig for that, so I'm not
> > doing anything about it in this series.
> > Can someone send a patch to deal with it properly?
> 
> Why would sibling calls be a problem for ORC?  Once a function does a
> sibling call, it has effectively returned and shouldn't show up on the
> stack trace anyway.

Answering my own question, I guess some people might find it confusing
to have a caller skipped in the stack trace.  But nobody has ever
complained about it.

It's not a problem for livepatch since we only care about the return
path.

Jann Horn March 1, 2019, 4:55 p.m. UTC | #3
On Fri, Mar 1, 2019 at 5:29 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Fri, Mar 01, 2019 at 10:24:18AM -0600, Josh Poimboeuf wrote:
> > > Is there a reason why the top-level Makefile only sets
> > > -fno-optimize-sibling-calls if CONFIG_FRAME_POINTER is set?
> > > I suspect that this is just a historical thing, because reliable
> > > unwinding didn't work without frame pointers until ORC came along.
> > > I'm not quite sure how best to express "don't do tail optimization if
> > > either frame pointers are used or ORC is used" in a Makefile, and
> > > whether we want an indirection through Kconfig for that, so I'm not
> > > doing anything about it in this series.
> > > Can someone send a patch to deal with it properly?
> >
> > Why would sibling calls be a problem for ORC?  Once a function does a
> > sibling call, it has effectively returned and shouldn't show up on the
> > stack trace anyway.
>
> Answering my own question, I guess some people might find it confusing
> to have a caller skipped in the stack trace.  But nobody has ever
> complained about it.
>
> It's not a problem for livepatch since we only care about the return
> path.

Yeah, that's my concern. I understand that it's irrelevant for tooling
that wants to understand what context a function is running in, but it
might matter to a human trying to understand how a function was
reached. In a theoretical worst case, a stack trace might skip
directly from do_syscall_64() into some random helper function that
received a bad pointer, and that might make debugging harder.

Patch

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 26038eacf74a..89be1be1790c 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -113,6 +113,20 @@  static struct orc_entry *orc_ftrace_find(unsigned long ip)
 }
 #endif
 
+/*
+ * If we crash with IP==0, the last successfully executed instruction
+ * was probably an indirect function call with a NULL function pointer,
+ * and we don't have unwind information for NULL.
+ * This hardcoded ORC entry for IP==0 allows us to unwind from a NULL function
+ * pointer into its parent and then continue normally from there.
+ */
+static struct orc_entry null_orc_entry = {
+	.sp_offset = sizeof(long),
+	.sp_reg = ORC_REG_SP,
+	.bp_reg = ORC_REG_UNDEFINED,
+	.type = ORC_TYPE_CALL
+};
+
 static struct orc_entry *orc_find(unsigned long ip)
 {
 	static struct orc_entry *orc;
@@ -120,6 +134,9 @@  static struct orc_entry *orc_find(unsigned long ip)
 	if (!orc_init)
 		return NULL;
 
+	if (ip == 0)
+		return &null_orc_entry;
+
 	/* For non-init vmlinux addresses, use the fast lookup table: */
 	if (ip >= LOOKUP_START_IP && ip < LOOKUP_STOP_IP) {
 		unsigned int idx, start, stop;