[v10,02/11] arm64: Make perf_callchain_kernel() use arch_stack_walk()

Message ID 20211015025847.17694-3-madvenka@linux.microsoft.com (mailing list archive)
State New, archived
Series: arm64: Reorganize the unwinder and implement stack trace reliability checks

Commit Message

Madhavan T. Venkataraman Oct. 15, 2021, 2:58 a.m. UTC
From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Currently, perf_callchain_kernel() in ARM64 code walks the stack using
start_backtrace() and walk_stackframe(). Make it use arch_stack_walk()
instead. This makes maintenance easier.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 arch/arm64/kernel/perf_callchain.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)
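
For reference, arch_stack_walk() is the arch-neutral unwinder entry point from <linux/stacktrace.h> that this patch switches to; at the time of this series its interface looks roughly like the following (paraphrased, not verbatim):

typedef bool (*stack_trace_consume_fn)(void *cookie, unsigned long addr);

/*
 * Walk the stack of @task (or the state captured in @regs), invoking
 * @consume_entry for each traced address until it returns false.
 */
void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
		     struct task_struct *task, struct pt_regs *regs);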

Comments

Mark Brown Oct. 20, 2021, 2:59 p.m. UTC | #1
On Thu, Oct 14, 2021 at 09:58:38PM -0500, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> Currently, perf_callchain_kernel() in ARM64 code walks the stack using
> start_backtrace() and walk_stackframe(). Make it use arch_stack_walk()
> instead. This makes maintenance easier.

>  static bool callchain_trace(void *data, unsigned long pc)
>  {
>  	struct perf_callchain_entry_ctx *entry = data;
> -	perf_callchain_store(entry, pc);
> -	return true;
> +	return perf_callchain_store(entry, pc) == 0;
>  }

This changes us from unconditionally doing the whole walk to returning
an error if perf_callchain_store() returns an error, so it's not quite a
straight transform. Since that seems like a useful improvement which
most likely won't have any practical impact, that's fine.
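
Concretely, an arch_stack_walk() implementation is expected to stop as soon as the callback returns false, roughly along these lines (an illustrative sketch using hypothetical unwind_*() helpers, not the actual arm64 unwinder):

void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
		     struct task_struct *task, struct pt_regs *regs)
{
	struct unwind_state state;

	unwind_init(&state, task, regs);	/* hypothetical helper */
	while (!unwind_done(&state)) {		/* hypothetical helper */
		if (!consume_entry(cookie, state.pc))
			break;			/* callback asked us to stop */
		unwind_next(&state);		/* hypothetical helper */
	}
}

So with the new callchain_trace(), the walk terminates as soon as perf_callchain_store() reports a full buffer.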

Reviewed-by: Mark Brown <broonie@kernel.org>
Madhavan T. Venkataraman Oct. 21, 2021, 12:28 p.m. UTC | #2
On 10/20/21 9:59 AM, Mark Brown wrote:
> On Thu, Oct 14, 2021 at 09:58:38PM -0500, madvenka@linux.microsoft.com wrote:
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>
>> Currently, perf_callchain_kernel() in ARM64 code walks the stack using
>> start_backtrace() and walk_stackframe(). Make it use arch_stack_walk()
>> instead. This makes maintenance easier.
> 
>>  static bool callchain_trace(void *data, unsigned long pc)
>>  {
>>  	struct perf_callchain_entry_ctx *entry = data;
>> -	perf_callchain_store(entry, pc);
>> -	return true;
>> +	return perf_callchain_store(entry, pc) == 0;
>>  }
> 
> This changes us from unconditionally doing the whole walk to returning
> an error if perf_callchain_store() returns an error, so it's not quite a
> straight transform. Since that seems like a useful improvement which
> most likely won't have any practical impact, that's fine.
> 
> Reviewed-by: Mark Brown <broonie@kernel.org>
> 

Thanks.

Madhavan
Mark Rutland Oct. 22, 2021, 6:11 p.m. UTC | #3
On Thu, Oct 14, 2021 at 09:58:38PM -0500, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> Currently, perf_callchain_kernel() in ARM64 code walks the stack using
> start_backtrace() and walk_stackframe(). Make it use arch_stack_walk()
> instead. This makes maintenance easier.
> 
> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>

This looks good to me; bailing out when perf_callchain_store() can't
accept any more entries absolutely makes sense.
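
For reference, perf_callchain_store() in <linux/perf_event.h> behaves roughly as below (a paraphrased sketch, not the exact source): it returns 0 while there is room in the entry buffer and non-zero once max_stack is reached, which is what the new "== 0" check keys off:

static inline int perf_callchain_store(struct perf_callchain_entry_ctx *ctx, u64 ip)
{
	if (ctx->nr < ctx->max_stack) {
		ctx->entry->ip[ctx->entry->nr++] = ip;
		ctx->nr++;
		return 0;	/* stored; keep walking */
	}
	return -1;		/* buffer full; callers may stop */
}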

I gave this a spin with:

| #  perf record -g -c1 ls
| #  perf report

... and the recorded callchains look sane.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>

As mentioned on patch 1, I'd like to get this rebased atop Peter's
untangling of ARCH_STACKWALK from STACKTRACE.

Thanks,
Mark.

> ---
>  arch/arm64/kernel/perf_callchain.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm64/kernel/perf_callchain.c b/arch/arm64/kernel/perf_callchain.c
> index 4a72c2727309..f173c448e852 100644
> --- a/arch/arm64/kernel/perf_callchain.c
> +++ b/arch/arm64/kernel/perf_callchain.c
> @@ -140,22 +140,18 @@ void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
>  static bool callchain_trace(void *data, unsigned long pc)
>  {
>  	struct perf_callchain_entry_ctx *entry = data;
> -	perf_callchain_store(entry, pc);
> -	return true;
> +	return perf_callchain_store(entry, pc) == 0;
>  }
>  
>  void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
>  			   struct pt_regs *regs)
>  {
> -	struct stackframe frame;
> -
>  	if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
>  		/* We don't support guest os callchain now */
>  		return;
>  	}
>  
> -	start_backtrace(&frame, regs->regs[29], regs->pc);
> -	walk_stackframe(current, &frame, callchain_trace, entry);
> +	arch_stack_walk(callchain_trace, entry, current, regs);
>  }
>  
>  unsigned long perf_instruction_pointer(struct pt_regs *regs)
> -- 
> 2.25.1
>
Madhavan T. Venkataraman Oct. 23, 2021, 12:49 p.m. UTC | #4
On 10/22/21 1:11 PM, Mark Rutland wrote:
> On Thu, Oct 14, 2021 at 09:58:38PM -0500, madvenka@linux.microsoft.com wrote:
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>
>> Currently, perf_callchain_kernel() in ARM64 code walks the stack using
>> start_backtrace() and walk_stackframe(). Make it use arch_stack_walk()
>> instead. This makes maintenance easier.
>>
>> Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
> 
> This looks good to me; bailing out when perf_callchain_store() can't
> accept any more entries absolutely makes sense.
> 
> I gave this a spin with:
> 
> | #  perf record -g -c1 ls
> | #  perf report
> 
> ... and the recorded callchains look sane.
> 
> Reviewed-by: Mark Rutland <mark.rutland@arm.com>
> Tested-by: Mark Rutland <mark.rutland@arm.com>
> 

Thanks a lot!

> As mentioned on patch 1, I'd like to get this rebased atop Peter's
> untangling of ARCH_STACKWALK from STACKTRACE.
> 

Will do.

Thanks.

Madhavan

Patch

diff --git a/arch/arm64/kernel/perf_callchain.c b/arch/arm64/kernel/perf_callchain.c
index 4a72c2727309..f173c448e852 100644
--- a/arch/arm64/kernel/perf_callchain.c
+++ b/arch/arm64/kernel/perf_callchain.c
@@ -140,22 +140,18 @@  void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
 static bool callchain_trace(void *data, unsigned long pc)
 {
 	struct perf_callchain_entry_ctx *entry = data;
-	perf_callchain_store(entry, pc);
-	return true;
+	return perf_callchain_store(entry, pc) == 0;
 }
 
 void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
 			   struct pt_regs *regs)
 {
-	struct stackframe frame;
-
 	if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
 		/* We don't support guest os callchain now */
 		return;
 	}
 
-	start_backtrace(&frame, regs->regs[29], regs->pc);
-	walk_stackframe(current, &frame, callchain_trace, entry);
+	arch_stack_walk(callchain_trace, entry, current, regs);
 }
 
 unsigned long perf_instruction_pointer(struct pt_regs *regs)