[v4,3/3] arm64: reliable stacktraces

Message ID 20181026142157.B8FAA68C97@newverein.lst.de
State: New
Series: arm64 live patching

Commit Message

Torsten Duwe Oct. 26, 2018, 2:21 p.m. UTC
Enhance the stack unwinder so that it reports whether it had to stop
normally or due to an error condition; unwind_frame() will report
continue/error/normal ending and walk_stackframe() will pass that
info. __save_stack_trace() is used to check the validity of a stack;
save_stack_trace_tsk_reliable() can now trivially be implemented.
Modify arch/arm64/kernel/time.c as the only external caller so far
to recognise the new semantics.

I had to introduce a marker symbol kthread_return_to_user to tell
the normal origin of a kernel thread.

Signed-off-by: Torsten Duwe <duwe@suse.de>

Comments

Josh Poimboeuf Oct. 26, 2018, 3:37 p.m. UTC | #1
On Fri, Oct 26, 2018 at 04:21:57PM +0200, Torsten Duwe wrote:
> Enhance the stack unwinder so that it reports whether it had to stop
> normally or due to an error condition; unwind_frame() will report
> continue/error/normal ending and walk_stackframe() will pass that
> info. __save_stack_trace() is used to check the validity of a stack;
> save_stack_trace_tsk_reliable() can now trivially be implemented.
> Modify arch/arm64/kernel/time.c as the only external caller so far
> to recognise the new semantics.
> 
> I had to introduce a marker symbol kthread_return_to_user to tell
> the normal origin of a kernel thread.
> 
> Signed-off-by: Torsten Duwe <duwe@suse.de>

I haven't looked at the code, but the commit log doesn't inspire much
confidence.  It's missing everything I previously asked for in the
powerpc version.

There's zero mention of objtool.  What analysis was done to indicate
that we can rely on frame pointers?

Such a frame pointer analysis should be included in the commit log.  It
should describe *at least* the following:

- whether inline asm statements with call/branch instructions will
  confuse GCC into skipping the frame pointer setup if it considers the
  function to be a leaf function;

- whether hand-coded non-leaf assembly functions can accidentally omit
  the frame pointer prologue setup;

- whether GCC can generally be relied upon to get arm64 frame pointers
  right, in both normal operation and edge cases.


The commit log should also describe whether the unwinder itself can be
considered reliable for all edge cases:

- detection and reporting of preemption and page faults;

- detection and recovery from function graph tracing;

- detection and reporting of other unexpected conditions,
  including when the unwinder doesn't reach the end of the stack.
Mark Rutland Oct. 29, 2018, 9:28 a.m. UTC | #2
Hi Josh,

I also have a few concerns here, as it is not clear to me precisely what is
required from arch code. Is there any documentation I should look at?

On Fri, Oct 26, 2018 at 10:37:04AM -0500, Josh Poimboeuf wrote:
> On Fri, Oct 26, 2018 at 04:21:57PM +0200, Torsten Duwe wrote:
> > Enhance the stack unwinder so that it reports whether it had to stop
> > normally or due to an error condition; unwind_frame() will report
> > continue/error/normal ending and walk_stackframe() will pass that
> > info. __save_stack_trace() is used to check the validity of a stack;
> > save_stack_trace_tsk_reliable() can now trivially be implemented.
> > Modify arch/arm64/kernel/time.c as the only external caller so far
> > to recognise the new semantics.

There are a number of error conditions not currently handled by the unwinder
(mostly in the face of stack corruption), for which there have been prior
discussions on list.

Do we care about those cases, or do we consider things best-effort in the face
of stack corruption?

> > I had to introduce a marker symbol kthread_return_to_user to tell
> > the normal origin of a kernel thread.
> > 
> > Signed-off-by: Torsten Duwe <duwe@suse.de>
> 
> I haven't looked at the code, but the commit log doesn't inspire much
> confidence.  It's missing everything I previously asked for in the
> powerpc version.
> 
> There's zero mention of objtool.  What analysis was done to indicate
> that we can rely on frame pointers?
> 
> Such a frame pointer analysis should be included in the commit log.  It
> should describe *at least* the following:
> 
> - whether inline asm statements with call/branch instructions will
>   confuse GCC into skipping the frame pointer setup if it considers the
>   function to be a leaf function;

There's a reasonable chance that the out-of-line LL/SC atomics could confuse
GCC into thinking callers are leaf functions. That's the only inline asm that
I'm aware of with BL instructions (how calls are made on arm64).

> - whether hand-coded non-leaf assembly functions can accidentally omit
>   the frame pointer prologue setup;

Most of our assembly doesn't setup stackframes, and some of these are non-leaf,
e.g. __cpu_suspend_enter.

Also, I suspect our entry assembly may violate/confuse assumptions here. I've
been working to move more of that to C, but that isn't yet complete.

> - whether GCC can generally be relied upon to get arm64 frame pointers
>   right, in both normal operation and edge cases.
> 
> The commit log should also describe whether the unwinder itself can be
> considered reliable for all edge cases:
> 
> - detection and reporting of preemption and page faults;
> 
> - detection and recovery from function graph tracing;
> 
> - detection and reporting of other unexpected conditions,
>   including when the unwinder doesn't reach the end of the stack.

We may also have NMIs (with SDEI).

Thanks,
Mark.
Josh Poimboeuf Oct. 29, 2018, 3:42 p.m. UTC | #3
On Mon, Oct 29, 2018 at 09:28:12AM +0000, Mark Rutland wrote:
> Hi Josh,
> 
> I also have a few concerns here, as it is not clear to me precisely what is
> required from arch code. Is there any documentation I should look at?

The short answer is that we need:

1) Reliable frame pointers --  on x86 we do that with objtool:

   tools/objtool/Documentation/stack-validation.txt

2) Reliable unwinder -- on x86 we had to rewrite the unwinder.  There's
   no documentation but the code is simple enough.  See
   unwind_next_frame() in arch/x86/kernel/unwind_frame.c  and
   __save_stack_trace_reliable() in arch/x86/kernel/stacktrace.c.

> On Fri, Oct 26, 2018 at 10:37:04AM -0500, Josh Poimboeuf wrote:
> > On Fri, Oct 26, 2018 at 04:21:57PM +0200, Torsten Duwe wrote:
> > > Enhance the stack unwinder so that it reports whether it had to stop
> > > normally or due to an error condition; unwind_frame() will report
> > > continue/error/normal ending and walk_stackframe() will pass that
> > > info. __save_stack_trace() is used to check the validity of a stack;
> > > save_stack_trace_tsk_reliable() can now trivially be implemented.
> > > Modify arch/arm64/kernel/time.c as the only external caller so far
> > > to recognise the new semantics.
> 
> There are a number of error conditions not currently handled by the unwinder
> (mostly in the face of stack corruption), for which there have been prior
> discussions on list.
> 
> Do we care about those cases, or do we consider things best-effort in the face
> of stack corruption?

The unwinder needs to be able to detect all stack corruption and return
an error.

[ But note that we don't need to worry about unwinding a task's stack
  while the task is running, which can be a common source of
  "corruption".  For livepatch we make sure every task is blocked
  (except when checking the current task). ]

It also needs to:

- detect preemption / page fault frames and return an error

- only return success if it reaches the end of the task stack; for user
  tasks, that means the syscall barrier; for kthreads/idle tasks, that
  means finding a defined thread entry point

- make sure it can't get into a recursive loop

- make sure each return address is a valid text address

- properly detect generated code hacks like function graph tracing and
  kretprobes


> > > I had to introduce a marker symbol kthread_return_to_user to tell
> > > the normal origin of a kernel thread.
> > > 
> > > Signed-off-by: Torsten Duwe <duwe@suse.de>
> > 
> > I haven't looked at the code, but the commit log doesn't inspire much
> > confidence.  It's missing everything I previously asked for in the
> > powerpc version.
> > 
> > There's zero mention of objtool.  What analysis was done to indicate
> > that we can rely on frame pointers?
> > 
> > Such a frame pointer analysis should be included in the commit log.  It
> > should describe *at least* the following:
> > 
> > - whether inline asm statements with call/branch instructions will
> >   confuse GCC into skipping the frame pointer setup if it considers the
> >   function to be a leaf function;
> 
> There's a reasonable chance that the out-of-line LL/SC atomics could confuse
> GCC into thinking callers are leaf functions. That's the only inline asm that
> I'm aware of with BL instructions (how calls are made on arm64).
>
> > - whether hand-coded non-leaf assembly functions can accidentally omit
> >   the frame pointer prologue setup;
> 
> Most of our assembly doesn't setup stackframes, and some of these are non-leaf,
> e.g. __cpu_suspend_enter.
> 
> Also, I suspect our entry assembly may violate/confuse assumptions here. I've
> been working to move more of that to C, but that isn't yet complete.

My experience with arm64 is very limited, but it sounds like it has some
of the same issues as x86.  In which case we may need to port objtool to
arm64.

> > - whether GCC can generally be relied upon to get arm64 frame pointers
> >   right, in both normal operation and edge cases.
> > 
> > The commit log should also describe whether the unwinder itself can be
> > considered reliable for all edge cases:
> > 
> > - detection and reporting of preemption and page faults;
> > 
> > - detection and recovery from function graph tracing;
> > 
> > - detection and reporting of other unexpected conditions,
> >   including when the unwinder doesn't reach the end of the stack.
> 
> We may also have NMIs (with SDEI).

NMIs shouldn't be an issue because livepatch only unwinds blocked tasks.

Patch

--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -128,8 +128,9 @@  config ARM64
 	select HAVE_PERF_EVENTS
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
-	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RCU_TABLE_FREE
+	select HAVE_REGS_AND_STACK_ACCESS_API
+	select HAVE_RELIABLE_STACKTRACE
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_KPROBES
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -33,7 +33,7 @@  struct stackframe {
 };
 
 extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
-extern void walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
+extern int walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
 			    int (*fn)(struct stackframe *, void *), void *data);
 extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk);
 
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -40,6 +40,16 @@ 
  *	ldp	x29, x30, [sp]
  *	add	sp, sp, #0x10
  */
+
+/* The bottom of kernel thread stacks points here */
+extern void *kthread_return_to_user;
+
+/*
+ * unwind_frame -- unwind a single stack frame.
+ * Returns 0 when there are more frames to go, 1 when the end
+ * of the stack has been reached, and a negative error code when
+ * the walk stopped because the frame information is not reliable.
+ */
 int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 {
 	unsigned long fp = frame->fp;
@@ -75,29 +85,39 @@  int notrace unwind_frame(struct task_str
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
 
 	/*
+	 * kthreads created via copy_thread() (called from kthread_create())
+	 * will have a zero FP and a return address into ret_from_fork.
+	 */
+	if (!frame->fp && frame->pc == (unsigned long)&kthread_return_to_user)
+		return 1;
+	/*
 	 * Frames created upon entry from EL0 have NULL FP and PC values, so
 	 * don't bother reporting these. Frames created by __noreturn functions
 	 * might have a valid FP even if PC is bogus, so only terminate where
 	 * both are NULL.
 	 */
 	if (!frame->fp && !frame->pc)
-		return -EINVAL;
+		return 1;
 
 	return 0;
 }
 
-void notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
+int notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
 		     int (*fn)(struct stackframe *, void *), void *data)
 {
 	while (1) {
 		int ret;
 
-		if (fn(frame, data))
-			break;
+		ret = fn(frame, data);
+		if (ret)
+			return ret;
 		ret = unwind_frame(tsk, frame);
 		if (ret < 0)
+			return ret;
+		if (ret > 0)
 			break;
 	}
+	return 0;
 }
 
 #ifdef CONFIG_STACKTRACE
@@ -145,14 +165,15 @@  void save_stack_trace_regs(struct pt_reg
 		trace->entries[trace->nr_entries++] = ULONG_MAX;
 }
 
-static noinline void __save_stack_trace(struct task_struct *tsk,
+static noinline int __save_stack_trace(struct task_struct *tsk,
 	struct stack_trace *trace, unsigned int nosched)
 {
 	struct stack_trace_data data;
 	struct stackframe frame;
+	int ret;
 
 	if (!try_get_task_stack(tsk))
-		return;
+		return -EBUSY;
 
 	data.trace = trace;
 	data.skip = trace->skip;
@@ -171,11 +192,12 @@  static noinline void __save_stack_trace(
 	frame.graph = tsk->curr_ret_stack;
 #endif
 
-	walk_stackframe(tsk, &frame, save_trace, &data);
+	ret = walk_stackframe(tsk, &frame, save_trace, &data);
 	if (trace->nr_entries < trace->max_entries)
 		trace->entries[trace->nr_entries++] = ULONG_MAX;
 
 	put_task_stack(tsk);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
 
@@ -190,4 +212,12 @@  void save_stack_trace(struct stack_trace
 }
 
 EXPORT_SYMBOL_GPL(save_stack_trace);
+
+int save_stack_trace_tsk_reliable(struct task_struct *tsk,
+				  struct stack_trace *trace)
+{
+	return __save_stack_trace(tsk, trace, 1);
+}
+EXPORT_SYMBOL_GPL(save_stack_trace_tsk_reliable);
+
 #endif
--- a/arch/arm64/kernel/time.c
+++ b/arch/arm64/kernel/time.c
@@ -56,7 +56,7 @@  unsigned long profile_pc(struct pt_regs
 #endif
 	do {
 		int ret = unwind_frame(NULL, &frame);
-		if (ret < 0)
+		if (ret)
 			return 0;
 	} while (in_lock_functions(frame.pc));
 
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -1178,15 +1178,17 @@  ENTRY(cpu_switch_to)
 ENDPROC(cpu_switch_to)
 NOKPROBE(cpu_switch_to)
 
+	.global	kthread_return_to_user
 /*
  * This is how we return from a fork.
  */
 ENTRY(ret_from_fork)
 	bl	schedule_tail
-	cbz	x19, 1f				// not a kernel thread
+	cbz	x19, kthread_return_to_user	// not a kernel thread
 	mov	x0, x20
 	blr	x19
-1:	get_thread_info tsk
+kthread_return_to_user:
+	get_thread_info tsk
 	b	ret_to_user
 ENDPROC(ret_from_fork)
 NOKPROBE(ret_from_fork)