diff mbox series

[2/8] ARM: ftrace: use ADD not POP to counter PUSH at entry

Message ID 20220125153656.1802079-3-ardb@kernel.org (mailing list archive)
State New, archived
Headers show
Series ARM: ftrace fixes and cleanups | expand

Commit Message

Ard Biesheuvel Jan. 25, 2022, 3:36 p.m. UTC
The compiler emitted hook used for ftrace consists of a PUSH {LR} to
preserve the link register, followed by a branch-and-link (BL) to
__gnu_mount_nc. Dynamic ftrace patches away the latter to turn the
combined sequence into a NOP, using a POP {LR} instruction.

This is not necessary, since the link register does not get clobbered in
this case, and simply adding #4 to the stack pointer is sufficient, and
avoids a memory access that may take a few cycles to resolve depending
on the micro-architecture.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/entry-ftrace.S | 2 +-
 arch/arm/kernel/ftrace.c       | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

Comments

Nick Desaulniers Jan. 25, 2022, 7:23 p.m. UTC | #1
On Tue, Jan 25, 2022 at 7:37 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> The compiler emitted hook used for ftrace consists of a PUSH {LR} to
> preserve the link register, followed by a branch-and-link (BL) to
> __gnu_mount_nc. Dynamic ftrace patches away the latter to turn the
> combined sequence into a NOP, using a POP {LR} instruction.
>
> This is not necessary, since the link register does not get clobbered in
> this case, and simply adding #4 to the stack pointer is sufficient, and
> avoids a memory access that may take a few cycles to resolve depending
> on the micro-architecture.

Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm/kernel/entry-ftrace.S | 2 +-
>  arch/arm/kernel/ftrace.c       | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
> index f4886fb6e9ba..dca12a09322a 100644
> --- a/arch/arm/kernel/entry-ftrace.S
> +++ b/arch/arm/kernel/entry-ftrace.S
> @@ -27,7 +27,7 @@
>   * allows it to be clobbered in subroutines and doesn't use it to hold
>   * parameters.)
>   *
> - * When using dynamic ftrace, we patch out the mcount call by a "pop {lr}"
> + * When using dynamic ftrace, we patch out the mcount call by a "add sp, #4"
>   * instead of the __gnu_mcount_nc call (see arch/arm/kernel/ftrace.c).
>   */
>
> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
> index a006585e1c09..db72d3a6522d 100644
> --- a/arch/arm/kernel/ftrace.c
> +++ b/arch/arm/kernel/ftrace.c
> @@ -25,9 +25,9 @@
>  #include <asm/patch.h>
>
>  #ifdef CONFIG_THUMB2_KERNEL
> -#define        NOP             0xf85deb04      /* pop.w {lr} */
> +#define        NOP             0xf10d0d04      /* add.w sp, sp, #4 */
>  #else
> -#define        NOP             0xe8bd4000      /* pop {lr} */
> +#define        NOP             0xe28dd004      /* add   sp, sp, #4 */
>  #endif
>
>  #ifdef CONFIG_DYNAMIC_FTRACE
> --
> 2.30.2
>
Linus Walleij Feb. 2, 2022, 11:59 p.m. UTC | #2
On Tue, Jan 25, 2022 at 4:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:

> The compiler emitted hook used for ftrace consists of a PUSH {LR} to
> preserve the link register, followed by a branch-and-link (BL) to
> __gnu_mount_nc. Dynamic ftrace patches away the latter to turn the
> combined sequence into a NOP, using a POP {LR} instruction.
>
> This is not necessary, since the link register does not get clobbered in
> this case, and simply adding #4 to the stack pointer is sufficient, and
> avoids a memory access that may take a few cycles to resolve depending
> on the micro-architecture.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

That's on the hotpath for ftrace so definitely worth optimizing!
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Maybe some of the commit message should go into
the code comment as well? "The compiler hook pushed 4 bytes with
a PUSH{LR}, so discard them by simply adding 4 to the
stack pointer" or so? Just a suggestion.

Yours,
Linus Walleij
diff mbox series

Patch

diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
index f4886fb6e9ba..dca12a09322a 100644
--- a/arch/arm/kernel/entry-ftrace.S
+++ b/arch/arm/kernel/entry-ftrace.S
@@ -27,7 +27,7 @@ 
  * allows it to be clobbered in subroutines and doesn't use it to hold
  * parameters.)
  *
- * When using dynamic ftrace, we patch out the mcount call by a "pop {lr}"
+ * When using dynamic ftrace, we patch out the mcount call by a "add sp, #4"
  * instead of the __gnu_mcount_nc call (see arch/arm/kernel/ftrace.c).
  */
 
diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
index a006585e1c09..db72d3a6522d 100644
--- a/arch/arm/kernel/ftrace.c
+++ b/arch/arm/kernel/ftrace.c
@@ -25,9 +25,9 @@ 
 #include <asm/patch.h>
 
 #ifdef CONFIG_THUMB2_KERNEL
-#define	NOP		0xf85deb04	/* pop.w {lr} */
+#define	NOP		0xf10d0d04	/* add.w sp, sp, #4 */
 #else
-#define	NOP		0xe8bd4000	/* pop {lr} */
+#define	NOP		0xe28dd004	/* add   sp, sp, #4 */
 #endif
 
 #ifdef CONFIG_DYNAMIC_FTRACE