[v15,6/6] arm64: Introduce arch_stack_walk_reliable()

Message ID 20220617210717.27126-7-madvenka@linux.microsoft.com (mailing list archive)
State New, archived
Series arm64: Reorganize the unwinder and implement stack trace reliability checks

Commit Message

Madhavan T. Venkataraman June 17, 2022, 9:07 p.m. UTC
From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Introduce arch_stack_walk_reliable() for ARM64. This works like
arch_stack_walk() except that it returns -EINVAL if the stack trace is not
reliable.

Until all the reliability checks are in place, arch_stack_walk_reliable()
may not be used by livepatch. But it may be used by debug and test code.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/kernel/stacktrace.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

Comments

Mark Rutland June 26, 2022, 8:57 a.m. UTC | #1
On Fri, Jun 17, 2022 at 04:07:17PM -0500, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> Introduce arch_stack_walk_reliable() for ARM64. This works like
> arch_stack_walk() except that it returns -EINVAL if the stack trace is not
> reliable.
> 
> Until all the reliability checks are in place, arch_stack_walk_reliable()
> may not be used by livepatch. But it may be used by debug and test code.

For the moment I would strongly prefer *not* to add this until we have the
missing bits and pieces sorted out.

Until then, I'd like to ensure that any infrastructure we add is immediately
useful and tested. One way to do that would be to enhance the stack dumping
code (i.e. dump_backtrace()) to log some metadata.

As an end-goal, I'd like to get to a point where we can do:

* Explicit logging when the trace terminates at the final frame, e.g.

  stacktrace:
    function_c+offset/total
    function_b+offset/total
    function_a+offset/total
    <unwind successful>

* Explicit logging of early termination, e.g.

  stacktrace:
    function_c+offset/total
    <unwind terminated early (bad FP)>

* Unreliability on individual elements, e.g.

  stacktrace:
    function_c+offset/total
    function_b+offset/total (?)
    function_a+offset/total

* Annotations for special unwinding, e.g.

  stacktrace:
    function_c+offset/total (K) // kretprobes trampoline
    function_b+offset/total (F) // ftrace trampoline
    function_a+offset/total (FK) // ftrace and kretprobes
    other_function+offset/total (P) // from pt_regs::pc
    another_function+offset/total (L?) // from pt_regs::lr, unreliable
    something_else+offset/total

  Note: the comments here are just to explain the idea, I don't expect those in
  the actual output.

That'll justify some of the infrastructure we need for reliable unwinding, and
ensure that it is tested, well before we actually enable reliable stacktracing.
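
The annotated output format listed above could be produced by a small formatting helper. What follows is only a userspace sketch of that idea, not kernel code: the ANN_* flags and format_entry() are hypothetical names invented for illustration, and the "+0x.../0x..." layout is assumed to stand in for the "+offset/total" placeholders.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical annotation flags, one per marker in the examples above. */
#define ANN_FTRACE     (1u << 0)	/* (F) ftrace trampoline */
#define ANN_KRETPROBE  (1u << 1)	/* (K) kretprobes trampoline */
#define ANN_REGS_PC    (1u << 2)	/* (P) from pt_regs::pc */
#define ANN_REGS_LR    (1u << 3)	/* (L) from pt_regs::lr */
#define ANN_UNRELIABLE (1u << 4)	/* (?) entry is not reliable */

/*
 * Format one trace entry as "sym+0xoff/0xsize", followed by " (TAGS)"
 * when any annotation is set. Returns the snprintf() result.
 */
static int format_entry(char *buf, size_t len, const char *sym,
			unsigned long offset, unsigned long size,
			unsigned int ann)
{
	char tags[8];	/* at most five tag characters plus the terminator */
	size_t n = 0;

	assert(buf != NULL && sym != NULL);

	if (ann & ANN_FTRACE)
		tags[n++] = 'F';
	if (ann & ANN_KRETPROBE)
		tags[n++] = 'K';
	if (ann & ANN_REGS_PC)
		tags[n++] = 'P';
	if (ann & ANN_REGS_LR)
		tags[n++] = 'L';
	if (ann & ANN_UNRELIABLE)
		tags[n++] = '?';
	tags[n] = '\0';

	if (n == 0)
		return snprintf(buf, len, "%s+0x%lx/0x%lx", sym, offset, size);

	return snprintf(buf, len, "%s+0x%lx/0x%lx (%s)", sym, offset, size, tags);
}
```

Under these assumptions, calling format_entry() with ANN_FTRACE | ANN_KRETPROBE gives an entry ending in "(FK)", and ANN_REGS_LR | ANN_UNRELIABLE gives one ending in "(L?)", matching the examples.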

Thanks,
Mark.

> 
> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
> Reviewed-by: Mark Brown <broonie@kernel.org>
> ---
>  arch/arm64/kernel/stacktrace.c | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index eda8581f7dbe..8016ba0e2c96 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -383,3 +383,26 @@ noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
>  
>  	unwind(&state, consume_entry, cookie);
>  }
> +
> +/*
> + * arch_stack_walk_reliable() may not be used for livepatch until all of
> + * the reliability checks are in place in unwind_consume(). However,
> + * debug and test code can choose to use it even if all the checks are not
> + * in place.
> + */
> +noinline int notrace arch_stack_walk_reliable(
> +					stack_trace_consume_fn consume_entry,
> +					void *cookie,
> +					struct task_struct *task)
> +{
> +	struct unwind_state state;
> +	bool reliable;
> +
> +	if (task == current)
> +		unwind_init_from_caller(&state);
> +	else
> +		unwind_init_from_task(&state, task);
> +
> +	reliable = unwind(&state, consume_entry, cookie);
> +	return reliable ? 0 : -EINVAL;
> +}
> -- 
> 2.25.1
>

Madhavan T. Venkataraman June 27, 2022, 5:53 a.m. UTC | #2
On 6/26/22 03:57, Mark Rutland wrote:
> On Fri, Jun 17, 2022 at 04:07:17PM -0500, madvenka@linux.microsoft.com wrote:
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>
>> Introduce arch_stack_walk_reliable() for ARM64. This works like
>> arch_stack_walk() except that it returns -EINVAL if the stack trace is not
>> reliable.
>>
>> Until all the reliability checks are in place, arch_stack_walk_reliable()
>> may not be used by livepatch. But it may be used by debug and test code.
> 
> For the moment I would strongly prefer *not* to add this until we have the
> missing bits and pieces sorted out.
> 

Yes. I am removing this from the patch series.

> Until then, I'd like to ensure that any infrastructure we add is immediately
> useful and tested. One way to do that would be to enhance the stack dumping
> code (i.e. dump_backtrace()) to log some metadata.
> 
> As an end-goal, I'd like to get to a point where we can do:
> 
> * Explicit logging when the trace terminates at the final frame, e.g.
> 
>   stacktrace:
>     function_c+offset/total
>     function_b+offset/total
>     function_a+offset/total
>     <unwind successful>
> 
> * Explicit logging of early termination, e.g.
> 
>   stacktrace:
>     function_c+offset/total
>     <unwind terminated early (bad FP)>
> 
> * Unreliability on individual elements, e.g.
> 
>   stacktrace:
>     function_c+offset/total
>     function_b+offset/total (?)
>     function_a+offset/total
> 
> * Annotations for special unwinding, e.g.
> 
>   stacktrace:
>     function_c+offset/total (K) // kretprobes trampoline
>     function_b+offset/total (F) // ftrace trampoline
>     function_a+offset/total (FK) // ftrace and kretprobes
>     other_function+offset/total (P) // from pt_regs::pc
>     another_function+offset/total (L?) // from pt_regs::lr, unreliable
>     something_else+offset/total
> 
>   Note: the comments here are just to explain the idea, I don't expect those in
>   the actual output.
> 
> That'll justify some of the infrastructure we need for reliable unwinding, and
> ensure that it is tested, well before we actually enable reliable stacktracing.
> 

In the current code structure, the annotations are a problem.

The printing of the entry along with the annotations and metadata cannot be done in
the unwind functions themselves as the caller may not even want anything printed.
The printing has to be done in consume_entry() if the caller wants to do it. But
consume_entry() only gets the PC as the argument (apart from the cookie passed by
the caller). It currently has no way of figuring out where the PC was obtained from
(ftrace, kretprobe, pt_regs, etc) or if the PC is reliable.

We need to replace the PC argument with a pointer to a structure that contains the
PC as well as other information about the PC. unwind_init() and unwind_next() need
to update that for each frame.
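
A rough userspace sketch of that interface change follows. Everything in it is an assumption made for illustration: struct frame_info, enum frame_source, consume_frame_fn and walk_frames() are hypothetical names, and walk_frames() is only a stand-in for unwind() that iterates over frames prepared in advance rather than walking a real stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: where the unwinder obtained a PC. */
enum frame_source {
	SRC_FRAME_RECORD,	/* ordinary frame record */
	SRC_FTRACE,		/* recovered from an ftrace trampoline */
	SRC_KRETPROBE,		/* recovered from a kretprobes trampoline */
	SRC_PT_REGS_PC,		/* taken from pt_regs::pc */
	SRC_PT_REGS_LR,		/* taken from pt_regs::lr */
};

/* Hypothetical per-frame record handed to the consumer instead of a bare PC. */
struct frame_info {
	unsigned long pc;
	enum frame_source source;
	bool reliable;
};

/* Consumer callback: returns false to stop the walk early. */
typedef bool (*consume_frame_fn)(void *cookie, const struct frame_info *info);

/*
 * Stand-in for unwind(): pass each frame to the consumer and report
 * whether every frame visited was marked reliable.
 */
static bool walk_frames(const struct frame_info *frames, size_t n,
			consume_frame_fn consume, void *cookie)
{
	bool reliable = true;
	size_t i;

	assert(frames != NULL && consume != NULL);

	for (i = 0; i < n; i++) {
		if (!frames[i].reliable)
			reliable = false;
		if (!consume(cookie, &frames[i]))
			break;
	}
	return reliable;
}

/* Example consumer: counts unreliable frames via the cookie. */
static bool count_unreliable(void *cookie, const struct frame_info *info)
{
	size_t *count = cookie;

	if (!info->reliable)
		(*count)++;
	return true;
}
```

With a record like this, a consumer that prints entries can choose its own annotations, while a consumer that does not print anything simply ignores the extra fields.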

If this approach is acceptable, I will submit a patch series for that. Please let
me know.

Thanks.

Madhavan

Patch

diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index eda8581f7dbe..8016ba0e2c96 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -383,3 +383,26 @@  noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
 
 	unwind(&state, consume_entry, cookie);
 }
+
+/*
+ * arch_stack_walk_reliable() may not be used for livepatch until all of
+ * the reliability checks are in place in unwind_consume(). However,
+ * debug and test code can choose to use it even if all the checks are not
+ * in place.
+ */
+noinline int notrace arch_stack_walk_reliable(
+					stack_trace_consume_fn consume_entry,
+					void *cookie,
+					struct task_struct *task)
+{
+	struct unwind_state state;
+	bool reliable;
+
+	if (task == current)
+		unwind_init_from_caller(&state);
+	else
+		unwind_init_from_task(&state, task);
+
+	reliable = unwind(&state, consume_entry, cookie);
+	return reliable ? 0 : -EINVAL;
+}