
[RFC,v3,2/4] arm64: Check the return PC against unreliable code sections

Message ID 20210503173615.21576-3-madvenka@linux.microsoft.com (mailing list archive)
State New, archived
Series: arm64: Stack trace reliability checks in the unwinder

Commit Message

Madhavan T. Venkataraman May 3, 2021, 5:36 p.m. UTC
From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Create a sym_code_ranges[] array to cover the following text sections, which
contain functions defined with SYM_CODE_*(). These are low-level functions
that do not set up a proper frame pointer prologue and epilogue, so they are
inherently unreliable from a stack unwinding perspective.

	.entry.text
	.idmap.text
	.hyp.idmap.text
	.hyp.text
	.hibernate_exit.text
	.entry.tramp.text

If a return PC falls in any of these, mark the stack trace unreliable.

The only exception to this: if the unwinder has already reached the last
frame, it will not mark the stack trace unreliable, since there is no more
unwinding to do. E.g.,

	- ret_from_fork() occurs at the end of the stack trace of
	  kernel tasks.

	- el0_*() functions occur at the end of EL0 exception stack
	  traces. This covers all user task entries into the kernel.

NOTE:
	- EL1 exception handlers are in .entry.text. So, stack traces that
	  contain those functions will be marked unreliable. This covers
	  interrupts, exceptions and breakpoints encountered while executing
	  in the kernel.

	- At the end of an interrupt, the kernel can preempt the current
	  task if required. So, the stack traces of all preempted tasks will
	  show the interrupt frame and will be considered unreliable.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 arch/arm64/kernel/stacktrace.c | 54 ++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

Comments

Mark Brown May 4, 2021, 4:05 p.m. UTC | #1
On Mon, May 03, 2021 at 12:36:13PM -0500, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> Create a sym_code_ranges[] array to cover the following text sections that
> contain functions defined as SYM_CODE_*(). These functions are low-level

This makes sense to me - a few bikesheddy comments below but nothing
really substantive.

> +static struct code_range *lookup_range(unsigned long pc)

This feels like it should have a prefix on the name (eg, unwinder_)
since it looks collision prone.  Or lookup_code_range() rather than just
plain lookup_range().

> +{
+       struct code_range *range;
+         
+       for (range = sym_code_ranges; range->start; range++) {

It seems more idiomatic to use ARRAY_SIZE() rather than a sentinel here,
the array can't be empty.

> +	range = lookup_range(frame->pc);
> +
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  	if (tsk->ret_stack &&
>  		frame->pc == (unsigned long)return_to_handler) {
> @@ -118,9 +160,21 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  			return -EINVAL;
>  		frame->pc = ret_stack->ret;
>  		frame->pc = ptrauth_strip_insn_pac(frame->pc);
> +		return 0;
>  	}

Do we not need to look up the range of the restored pc and validate
what's being pointed to here?  It's not immediately obvious why we do
the lookup before handling the function graph tracer, especially given
that we never look at the result and there's now a return added skipping
further reliability checks.  At the very least I think this needs some
additional comments so the code is more obvious.
Madhavan T. Venkataraman May 4, 2021, 7:03 p.m. UTC | #2
On 5/4/21 11:05 AM, Mark Brown wrote:
> On Mon, May 03, 2021 at 12:36:13PM -0500, madvenka@linux.microsoft.com wrote:
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>
>> Create a sym_code_ranges[] array to cover the following text sections that
>> contain functions defined as SYM_CODE_*(). These functions are low-level
> 
> This makes sense to me - a few of bikesheddy comments below but nothing
> really substantive.
> 

OK.

>> +static struct code_range *lookup_range(unsigned long pc)
> 
> This feels like it should have a prefix on the name (eg, unwinder_)
> since it looks collision prone.  Or lookup_code_range() rather than just
> plain lookup_range().
> 

I will add the prefix.

>> +{
> +       struct code_range *range;
> +         
> +       for (range = sym_code_ranges; range->start; range++) {
> 
> It seems more idiomatic to use ARRAY_SIZE() rather than a sentinel here,
> the array can't be empty.
> 

If there is a match, I return the matched range. Else, I return the sentinel.
This is just so I don't have to check for range == NULL after calling
lookup_range().

I will change it to what you have suggested and check for NULL explicitly.
It is not a problem.

>> +	range = lookup_range(frame->pc);
>> +
>>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>>  	if (tsk->ret_stack &&
>>  		frame->pc == (unsigned long)return_to_handler) {
>> @@ -118,9 +160,21 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>>  			return -EINVAL;
>>  		frame->pc = ret_stack->ret;
>>  		frame->pc = ptrauth_strip_insn_pac(frame->pc);
>> +		return 0;
>>  	}
> 
> Do we not need to look up the range of the restored pc and validate
> what's being pointed to here?  It's not immediately obvious why we do
> the lookup before handling the function graph tracer, especially given
> that we never look at the result and there's now a return added skipping
> further reliability checks.  At the very least I think this needs some
> additional comments so the code is more obvious.
I want sym_code_ranges[] to contain both unwindable and non-unwindable ranges.
Unwindable ranges will be special ranges such as the return_to_handler() and
kretprobe_trampoline() functions for which the unwinder has (or will have)
special code to unwind. So, the lookup_range() has to happen before the
function graph code. Please look at the last patch in the series for
the fix for the above function graph code.

On the question of "should the original return address be checked against
sym_code_ranges[]?" - I assumed that if there is a function graph trace on a
function, it had to be an ftraceable function. It would not be a part
of sym_code_ranges[]. Is that a wrong assumption on my part?

Madhavan
Madhavan T. Venkataraman May 4, 2021, 7:32 p.m. UTC | #3
On 5/4/21 2:03 PM, Madhavan T. Venkataraman wrote:
> 
> 
> On 5/4/21 11:05 AM, Mark Brown wrote:
>> On Mon, May 03, 2021 at 12:36:13PM -0500, madvenka@linux.microsoft.com wrote:
>>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>>
>>> Create a sym_code_ranges[] array to cover the following text sections that
>>> contain functions defined as SYM_CODE_*(). These functions are low-level
>>
>> This makes sense to me - a few of bikesheddy comments below but nothing
>> really substantive.
>>
> 
> OK.
> 
>>> +static struct code_range *lookup_range(unsigned long pc)
>>
>> This feels like it should have a prefix on the name (eg, unwinder_)
>> since it looks collision prone.  Or lookup_code_range() rather than just
>> plain lookup_range().
>>
> 
> I will add the prefix.
> 
>>> +{
>> +       struct code_range *range;
>> +         
>> +       for (range = sym_code_ranges; range->start; range++) {
>>
>> It seems more idiomatic to use ARRAY_SIZE() rather than a sentinel here,
>> the array can't be empty.
>>
> 
> If there is a match, I return the matched range. Else, I return the sentinel.
> This is just so I don't have to check for range == NULL after calling
> lookup_range().
> 
> I will change it to what you have suggested and check for NULL explicitly.
> It is not a problem.
> 
>>> +	range = lookup_range(frame->pc);
>>> +
>>>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>>>  	if (tsk->ret_stack &&
>>>  		frame->pc == (unsigned long)return_to_handler) {
>>> @@ -118,9 +160,21 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>>>  			return -EINVAL;
>>>  		frame->pc = ret_stack->ret;
>>>  		frame->pc = ptrauth_strip_insn_pac(frame->pc);
>>> +		return 0;
>>>  	}
>>
>> Do we not need to look up the range of the restored pc and validate
>> what's being pointed to here?  It's not immediately obvious why we do
>> the lookup before handling the function graph tracer, especially given
>> that we never look at the result and there's now a return added skipping
>> further reliability checks.  At the very least I think this needs some
>> additional comments so the code is more obvious.
> I want sym_code_ranges[] to contain both unwindable and non-unwindable ranges.
> Unwindable ranges will be special ranges such as the return_to_handler() and
> kretprobe_trampoline() functions for which the unwinder has (or will have)
> special code to unwind. So, the lookup_range() has to happen before the
> function graph code. Please look at the last patch in the series for
> the fix for the above function graph code.
> 
> On the question of "should the original return address be checked against
> sym_code_ranges[]?" - I assumed that if there is a function graph trace on a
> function, it had to be an ftraceable function. It would not be a part
> of sym_code_ranges[]. Is that a wrong assumption on my part?
> 

If you prefer, I could do something like this:

check_pc:
	if (!__kernel_text_address(frame->pc))
		frame->reliable = false;

	range = lookup_range(frame->pc);

#ifdef CONFIG_FUNCTION_GRAPH_TRACER
	if (tsk->ret_stack &&
		frame->pc == (unsigned long)return_to_handler) {
		...
		frame->pc = ret_stack->ret;
		frame->pc = ptrauth_strip_insn_pac(frame->pc);
		goto check_pc;
	}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */

Is that acceptable?

Madhavan
Mark Brown May 5, 2021, 4:34 p.m. UTC | #4
On Tue, May 04, 2021 at 02:03:14PM -0500, Madhavan T. Venkataraman wrote:
> On 5/4/21 11:05 AM, Mark Brown wrote:

> >> @@ -118,9 +160,21 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
> >>  			return -EINVAL;
> >>  		frame->pc = ret_stack->ret;
> >>  		frame->pc = ptrauth_strip_insn_pac(frame->pc);
> >> +		return 0;
> >>  	}

> > Do we not need to look up the range of the restored pc and validate
> > what's being pointed to here?  It's not immediately obvious why we do
> > the lookup before handling the function graph tracer, especially given
> > that we never look at the result and there's now a return added skipping
> > further reliability checks.  At the very least I think this needs some
> > additional comments so the code is more obvious.

> I want sym_code_ranges[] to contain both unwindable and non-unwindable ranges.
> Unwindable ranges will be special ranges such as the return_to_handler() and
> kretprobe_trampoline() functions for which the unwinder has (or will have)
> special code to unwind. So, the lookup_range() has to happen before the
> function graph code. Please look at the last patch in the series for
> the fix for the above function graph code.

That sounds reasonable but like I say should probably be called out in
the code so it's clear to people working with it.

> On the question of "should the original return address be checked against
> sym_code_ranges[]?" - I assumed that if there is a function graph trace on a
> function, it had to be an ftraceable function. It would not be a part
> of sym_code_ranges[]. Is that a wrong assumption on my part?

I can't think of any cases where it wouldn't be right now, but it seems
easier to just do a redundant check than to have the assumption in the
code and have to think about if it's missing.
Mark Brown May 5, 2021, 4:46 p.m. UTC | #5
On Tue, May 04, 2021 at 02:32:35PM -0500, Madhavan T. Venkataraman wrote:

> If you prefer, I could do something like this:
> 
> check_pc:
> 	if (!__kernel_text_address(frame->pc))
> 		frame->reliable = false;
> 
> 	range = lookup_range(frame->pc);
> 
> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> 	if (tsk->ret_stack &&
> 		frame->pc == (unsigned long)return_to_handler) {
> 		...
> 		frame->pc = ret_stack->ret;
> 		frame->pc = ptrauth_strip_insn_pac(frame->pc);
> 		goto check_pc;
> 	}
> #endif /* CONFIG_FUNCTION_GRAPH_TRACER */

> Is that acceptable?

I think that works even if it's hard to love the goto, might want some
defensiveness to ensure we can't somehow end up in an infinite loop with
a sufficiently badly formed stack.
Madhavan T. Venkataraman May 5, 2021, 5:51 p.m. UTC | #6
On 5/5/21 11:34 AM, Mark Brown wrote:
> On Tue, May 04, 2021 at 02:03:14PM -0500, Madhavan T. Venkataraman wrote:
>> On 5/4/21 11:05 AM, Mark Brown wrote:
> 
>>>> @@ -118,9 +160,21 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>>>>  			return -EINVAL;
>>>>  		frame->pc = ret_stack->ret;
>>>>  		frame->pc = ptrauth_strip_insn_pac(frame->pc);
>>>> +		return 0;
>>>>  	}
> 
>>> Do we not need to look up the range of the restored pc and validate
>>> what's being pointed to here?  It's not immediately obvious why we do
>>> the lookup before handling the function graph tracer, especially given
>>> that we never look at the result and there's now a return added skipping
>>> further reliability checks.  At the very least I think this needs some
>>> additional comments so the code is more obvious.
> 
>> I want sym_code_ranges[] to contain both unwindable and non-unwindable ranges.
>> Unwindable ranges will be special ranges such as the return_to_handler() and
>> kretprobe_trampoline() functions for which the unwinder has (or will have)
>> special code to unwind. So, the lookup_range() has to happen before the
>> function graph code. Please look at the last patch in the series for
>> the fix for the above function graph code.
> 
> That sounds reasonable but like I say should probably be called out in
> the code so it's clear to people working with it.
> 

OK. To make this better, I will do the lookup_range() after the function
graph code to begin with. Then, in the last patch for the function graph
code, I will move it up. This way, the code is clear and your comment
is addressed.

>> On the question of "should the original return address be checked against
>> sym_code_ranges[]?" - I assumed that if there is a function graph trace on a
>> function, it had to be an ftraceable function. It would not be a part
>> of sym_code_ranges[]. Is that a wrong assumption on my part?
> 
> I can't think of any cases where it wouldn't be right now, but it seems
> easier to just do a redundant check than to have the assumption in the
> code and have to think about if it's missing.
> 

Agreed. Will do the check.

Madhavan
Madhavan T. Venkataraman May 5, 2021, 6:48 p.m. UTC | #7
On 5/5/21 11:46 AM, Mark Brown wrote:
> On Tue, May 04, 2021 at 02:32:35PM -0500, Madhavan T. Venkataraman wrote:
> 
>> If you prefer, I could do something like this:
>>
>> check_pc:
>> 	if (!__kernel_text_address(frame->pc))
>> 		frame->reliable = false;
>>
>> 	range = lookup_range(frame->pc);
>>
>> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>> 	if (tsk->ret_stack &&
>> 		frame->pc == (unsigned long)return_to_handler) {
>> 		...
>> 		frame->pc = ret_stack->ret;
>> 		frame->pc = ptrauth_strip_insn_pac(frame->pc);
>> 		goto check_pc;
>> 	}
>> #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
> 
>> Is that acceptable?
> 
> I think that works even if it's hard to love the goto, might want some
> defensiveness to ensure we can't somehow end up in an infinite loop with
> a sufficiently badly formed stack.
> 

I could do something like this:

- Move all frame->pc checking code into a function called check_frame_pc().

	bool	check_frame_pc(frame)
	{
		Do all the checks including function graph
		return frame->pc changed
	}

- Then, in unwind_frame()

unwind_frame()
{
	int	i;
	...

	for (i = 0; i < MAX_CHECKS; i++) {
		if (!check_frame(tsk, frame))
			break;
	}

	if (i == MAX_CHECKS)
		frame->reliable = false;
	return 0;
}

The above would take care of future cases like kretprobe_trampoline().

If this is acceptable, then the only question is - what should be the value of
MAX_CHECKS (I will rename it to something more appropriate)?

Madhavan
Madhavan T. Venkataraman May 5, 2021, 6:50 p.m. UTC | #8
On 5/5/21 1:48 PM, Madhavan T. Venkataraman wrote:
> 
> 
> On 5/5/21 11:46 AM, Mark Brown wrote:
>> On Tue, May 04, 2021 at 02:32:35PM -0500, Madhavan T. Venkataraman wrote:
>>
>>> If you prefer, I could do something like this:
>>>
>>> check_pc:
>>> 	if (!__kernel_text_address(frame->pc))
>>> 		frame->reliable = false;
>>>
>>> 	range = lookup_range(frame->pc);
>>>
>>> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>>> 	if (tsk->ret_stack &&
>>> 		frame->pc == (unsigned long)return_to_handler) {
>>> 		...
>>> 		frame->pc = ret_stack->ret;
>>> 		frame->pc = ptrauth_strip_insn_pac(frame->pc);
>>> 		goto check_pc;
>>> 	}
>>> #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
>>
>>> Is that acceptable?
>>
>> I think that works even if it's hard to love the goto, might want some
>> defensiveness to ensure we can't somehow end up in an infinite loop with
>> a sufficiently badly formed stack.
>>
> 
> I could do something like this:
> 
> - Move all frame->pc checking code into a function called check_frame_pc().
> 
> 	bool	check_frame_pc(frame)
> 	{
> 		Do all the checks including function graph
> 		return frame->pc changed
> 	}
> 
> - Then, in unwind_frame()
> 
> unwind_frame()
> {
> 	int	i;
> 	...
> 
> 	for (i = 0; i < MAX_CHECKS; i++) {
> 		if (!check_frame(tsk, frame))

Small typo in the last statement - it should be check_frame_pc().

Sorry.

Madhavan

> 			break;
> 	}
> 
> 	if (i == MAX_CHECKS)
> 		frame->reliable = false;
> 	return 0;
> }
> 
> The above would take care of future cases like kretprobe_trampoline().
> 
> If this is acceptable, then the only question is - what should be the value of
> MAX_CHECKS (I will rename it to something more appropriate)?
> 
> Madhavan
>
Ard Biesheuvel May 5, 2021, 7:30 p.m. UTC | #9
On Mon, 3 May 2021 at 19:38, <madvenka@linux.microsoft.com> wrote:
>
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>
> Create a sym_code_ranges[] array to cover the following text sections that
> contain functions defined as SYM_CODE_*(). These functions are low-level
> functions (and do not have a proper frame pointer prolog and epilog). So,
> they are inherently unreliable from a stack unwinding perspective.
>
>         .entry.text
>         .idmap.text
>         .hyp.idmap.text
>         .hyp.text
>         .hibernate_exit.text
>         .entry.tramp.text
>
> If a return PC falls in any of these, mark the stack trace unreliable.
>
> The only exception to this is - if the unwinder has reached the last
> frame already, it will not mark the stack trace unreliable since there
> is no more unwinding to do. E.g.,
>
>         - ret_from_fork() occurs at the end of the stack trace of
>           kernel tasks.
>
>         - el0_*() functions occur at the end of EL0 exception stack
>           traces. This covers all user task entries into the kernel.
>
> NOTE:
>         - EL1 exception handlers are in .entry.text. So, stack traces that
>           contain those functions will be marked not reliable. This covers
>           interrupts, exceptions and breakpoints encountered while executing
>           in the kernel.
>
>         - At the end of an interrupt, the kernel can preempt the current
>           task if required. So, the stack traces of all preempted tasks will
>           show the interrupt frame and will be considered unreliable.
>
> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
> ---
>  arch/arm64/kernel/stacktrace.c | 54 ++++++++++++++++++++++++++++++++++
>  1 file changed, 54 insertions(+)
>
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index c21a1bca28f3..1ff14615a55a 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -15,9 +15,48 @@
>
>  #include <asm/irq.h>
>  #include <asm/pointer_auth.h>
> +#include <asm/sections.h>
>  #include <asm/stack_pointer.h>
>  #include <asm/stacktrace.h>
>
> +struct code_range {
> +       unsigned long   start;
> +       unsigned long   end;
> +};
> +
> +struct code_range      sym_code_ranges[] =

This should be static and const

> +{
> +       /* non-unwindable ranges */
> +       { (unsigned long)__entry_text_start,
> +         (unsigned long)__entry_text_end },
> +       { (unsigned long)__idmap_text_start,
> +         (unsigned long)__idmap_text_end },
> +       { (unsigned long)__hyp_idmap_text_start,
> +         (unsigned long)__hyp_idmap_text_end },
> +       { (unsigned long)__hyp_text_start,
> +         (unsigned long)__hyp_text_end },
> +#ifdef CONFIG_HIBERNATION
> +       { (unsigned long)__hibernate_exit_text_start,
> +         (unsigned long)__hibernate_exit_text_end },
> +#endif
> +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> +       { (unsigned long)__entry_tramp_text_start,
> +         (unsigned long)__entry_tramp_text_end },
> +#endif
> +       { /* sentinel */ }
> +};
> +
> +static struct code_range *lookup_range(unsigned long pc)

const struct code_range *

> +{
> +       struct code_range *range;

const struct code_range *

> +
> +       for (range = sym_code_ranges; range->start; range++) {
> +               if (pc >= range->start && pc < range->end)
> +                       return range;
> +       }
> +       return range;
> +}
> +
>  /*
>   * AArch64 PCS assigns the frame pointer to x29.
>   *
> @@ -43,6 +82,7 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  {
>         unsigned long fp = frame->fp;
>         struct stack_info info;
> +       struct code_range *range;

const struct code_range *

>
>         frame->reliable = true;
>
> @@ -103,6 +143,8 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>                 return 0;
>         }
>
> +       range = lookup_range(frame->pc);
> +
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>         if (tsk->ret_stack &&
>                 frame->pc == (unsigned long)return_to_handler) {
> @@ -118,9 +160,21 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>                         return -EINVAL;
>                 frame->pc = ret_stack->ret;
>                 frame->pc = ptrauth_strip_insn_pac(frame->pc);
> +               return 0;
>         }
>  #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
>
> +       if (!range->start)
> +               return 0;
> +
> +       /*
> +        * The return PC falls in an unreliable function. If the final frame
> +        * has been reached, no more unwinding is needed. Otherwise, mark the
> +        * stack trace not reliable.
> +        */
> +       if (frame->fp)
> +               frame->reliable = false;
> +
>         return 0;
>  }
>  NOKPROBE_SYMBOL(unwind_frame);
> --
> 2.25.1
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Madhavan T. Venkataraman May 5, 2021, 8 p.m. UTC | #10
OK. I will make all the changes you suggested.

Thanks!

Madhavan

On 5/5/21 2:30 PM, Ard Biesheuvel wrote:
> On Mon, 3 May 2021 at 19:38, <madvenka@linux.microsoft.com> wrote:
>>
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>
>> Create a sym_code_ranges[] array to cover the following text sections that
>> contain functions defined as SYM_CODE_*(). These functions are low-level
>> functions (and do not have a proper frame pointer prolog and epilog). So,
>> they are inherently unreliable from a stack unwinding perspective.
>>
>>         .entry.text
>>         .idmap.text
>>         .hyp.idmap.text
>>         .hyp.text
>>         .hibernate_exit.text
>>         .entry.tramp.text
>>
>> If a return PC falls in any of these, mark the stack trace unreliable.
>>
>> The only exception to this is - if the unwinder has reached the last
>> frame already, it will not mark the stack trace unreliable since there
>> is no more unwinding to do. E.g.,
>>
>>         - ret_from_fork() occurs at the end of the stack trace of
>>           kernel tasks.
>>
>>         - el0_*() functions occur at the end of EL0 exception stack
>>           traces. This covers all user task entries into the kernel.
>>
>> NOTE:
>>         - EL1 exception handlers are in .entry.text. So, stack traces that
>>           contain those functions will be marked not reliable. This covers
>>           interrupts, exceptions and breakpoints encountered while executing
>>           in the kernel.
>>
>>         - At the end of an interrupt, the kernel can preempt the current
>>           task if required. So, the stack traces of all preempted tasks will
>>           show the interrupt frame and will be considered unreliable.
>>
>> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
>> ---
>>  arch/arm64/kernel/stacktrace.c | 54 ++++++++++++++++++++++++++++++++++
>>  1 file changed, 54 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
>> index c21a1bca28f3..1ff14615a55a 100644
>> --- a/arch/arm64/kernel/stacktrace.c
>> +++ b/arch/arm64/kernel/stacktrace.c
>> @@ -15,9 +15,48 @@
>>
>>  #include <asm/irq.h>
>>  #include <asm/pointer_auth.h>
>> +#include <asm/sections.h>
>>  #include <asm/stack_pointer.h>
>>  #include <asm/stacktrace.h>
>>
>> +struct code_range {
>> +       unsigned long   start;
>> +       unsigned long   end;
>> +};
>> +
>> +struct code_range      sym_code_ranges[] =
> 
> This should be static and const
> 
>> +{
>> +       /* non-unwindable ranges */
>> +       { (unsigned long)__entry_text_start,
>> +         (unsigned long)__entry_text_end },
>> +       { (unsigned long)__idmap_text_start,
>> +         (unsigned long)__idmap_text_end },
>> +       { (unsigned long)__hyp_idmap_text_start,
>> +         (unsigned long)__hyp_idmap_text_end },
>> +       { (unsigned long)__hyp_text_start,
>> +         (unsigned long)__hyp_text_end },
>> +#ifdef CONFIG_HIBERNATION
>> +       { (unsigned long)__hibernate_exit_text_start,
>> +         (unsigned long)__hibernate_exit_text_end },
>> +#endif
>> +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
>> +       { (unsigned long)__entry_tramp_text_start,
>> +         (unsigned long)__entry_tramp_text_end },
>> +#endif
>> +       { /* sentinel */ }
>> +};
>> +
>> +static struct code_range *lookup_range(unsigned long pc)
> 
> const struct code_range *
> 
>> +{
>> +       struct code_range *range;
> 
> const struct code_range *
> 
>> +
>> +       for (range = sym_code_ranges; range->start; range++) {
>> +               if (pc >= range->start && pc < range->end)
>> +                       return range;
>> +       }
>> +       return range;
>> +}
>> +
>>  /*
>>   * AArch64 PCS assigns the frame pointer to x29.
>>   *
>> @@ -43,6 +82,7 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>>  {
>>         unsigned long fp = frame->fp;
>>         struct stack_info info;
>> +       struct code_range *range;
> 
> const struct code_range *
> 
>>
>>         frame->reliable = true;
>>
>> @@ -103,6 +143,8 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>>                 return 0;
>>         }
>>
>> +       range = lookup_range(frame->pc);
>> +
>>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>>         if (tsk->ret_stack &&
>>                 frame->pc == (unsigned long)return_to_handler) {
>> @@ -118,9 +160,21 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>>                         return -EINVAL;
>>                 frame->pc = ret_stack->ret;
>>                 frame->pc = ptrauth_strip_insn_pac(frame->pc);
>> +               return 0;
>>         }
>>  #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
>>
>> +       if (!range->start)
>> +               return 0;
>> +
>> +       /*
>> +        * The return PC falls in an unreliable function. If the final frame
>> +        * has been reached, no more unwinding is needed. Otherwise, mark the
>> +        * stack trace not reliable.
>> +        */
>> +       if (frame->fp)
>> +               frame->reliable = false;
>> +
>>         return 0;
>>  }
>>  NOKPROBE_SYMBOL(unwind_frame);
>> --
>> 2.25.1
>>
>>
Mark Brown May 6, 2021, 1:45 p.m. UTC | #11
On Wed, May 05, 2021 at 01:48:21PM -0500, Madhavan T. Venkataraman wrote:
> On 5/5/21 11:46 AM, Mark Brown wrote:

> > I think that works even if it's hard to love the goto, might want some
> > defensiveness to ensure we can't somehow end up in an infinite loop with
> > a sufficiently badly formed stack.

> I could do something like this:

> unwind_frame()
> {
> 	int	i;
> 	...
> 
> 	for (i = 0; i < MAX_CHECKS; i++) {
> 		if (!check_frame(tsk, frame))
> 			break;
> 	}

I think that could work, yes.  Have to see the actual code (and other
people's opinions!).

> If this is acceptable, then the only question is - what should be the value of
> MAX_CHECKS (I will rename it to something more appropriate)?

I'd expect something like 10 to be way more than we'd ever need, or we
could define it down to the 2 checks we expect to be possible ATM to be
conservative.  I'm tempted to be permissive if we have sufficient other
checks but I'm not 100% sure on that.
Madhavan T. Venkataraman May 6, 2021, 3:21 p.m. UTC | #12
On 5/6/21 8:45 AM, Mark Brown wrote:
> On Wed, May 05, 2021 at 01:48:21PM -0500, Madhavan T. Venkataraman wrote:
>> On 5/5/21 11:46 AM, Mark Brown wrote:
> 
>>> I think that works even if it's hard to love the goto, might want some
>>> defensiveness to ensure we can't somehow end up in an infinite loop with
>>> a sufficiently badly formed stack.
> 
>> I could do something like this:
> 
>> unwind_frame()
>> {
>> 	int	i;
>> 	...
>>
>> 	for (i = 0; i < MAX_CHECKS; i++) {
>> 		if (!check_frame(tsk, frame))
>> 			break;
>> 	}
> 
> I think that could work, yes.  Have to see the actual code (and other
> people's opinions!).
> 
>> If this is acceptable, then the only question is - what should be the value of
>> MAX_CHECKS (I will rename it to something more appropriate)?
> 
> I'd expect something like 10 to be way more than we'd ever need, or we
> could define it down to the 2 checks we expect to be possible ATM to be
> conservative.  I'm tempted to be permissive if we have sufficient other
> checks but I'm not 100% sure on that.
> 

OK. I will implement these changes for version 4 and send it out so this
whole thing can be reviewed again with the actual changes in front of us.

Madhavan

Patch

diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index c21a1bca28f3..1ff14615a55a 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -15,9 +15,48 @@ 
 
 #include <asm/irq.h>
 #include <asm/pointer_auth.h>
+#include <asm/sections.h>
 #include <asm/stack_pointer.h>
 #include <asm/stacktrace.h>
 
+struct code_range {
+	unsigned long	start;
+	unsigned long	end;
+};
+
+struct code_range	sym_code_ranges[] =
+{
+	/* non-unwindable ranges */
+	{ (unsigned long)__entry_text_start,
+	  (unsigned long)__entry_text_end },
+	{ (unsigned long)__idmap_text_start,
+	  (unsigned long)__idmap_text_end },
+	{ (unsigned long)__hyp_idmap_text_start,
+	  (unsigned long)__hyp_idmap_text_end },
+	{ (unsigned long)__hyp_text_start,
+	  (unsigned long)__hyp_text_end },
+#ifdef CONFIG_HIBERNATION
+	{ (unsigned long)__hibernate_exit_text_start,
+	  (unsigned long)__hibernate_exit_text_end },
+#endif
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+	{ (unsigned long)__entry_tramp_text_start,
+	  (unsigned long)__entry_tramp_text_end },
+#endif
+	{ /* sentinel */ }
+};
+
+static struct code_range *lookup_range(unsigned long pc)
+{
+	struct code_range *range;
+
+	for (range = sym_code_ranges; range->start; range++) {
+		if (pc >= range->start && pc < range->end)
+			return range;
+	}
+	return range;
+}
+
 /*
  * AArch64 PCS assigns the frame pointer to x29.
  *
@@ -43,6 +82,7 @@  int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 {
 	unsigned long fp = frame->fp;
 	struct stack_info info;
+	struct code_range *range;
 
 	frame->reliable = true;
 
@@ -103,6 +143,8 @@  int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 		return 0;
 	}
 
+	range = lookup_range(frame->pc);
+
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	if (tsk->ret_stack &&
 		frame->pc == (unsigned long)return_to_handler) {
@@ -118,9 +160,21 @@  int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 			return -EINVAL;
 		frame->pc = ret_stack->ret;
 		frame->pc = ptrauth_strip_insn_pac(frame->pc);
+		return 0;
 	}
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
 
+	if (!range->start)
+		return 0;
+
+	/*
+	 * The return PC falls in an unreliable function. If the final frame
+	 * has been reached, no more unwinding is needed. Otherwise, mark the
+	 * stack trace not reliable.
+	 */
+	if (frame->fp)
+		frame->reliable = false;
+
 	return 0;
 }
 NOKPROBE_SYMBOL(unwind_frame);