Message ID | 20220125153656.1802079-3-ardb@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ARM: ftrace fixes and cleanups | expand |
On Tue, Jan 25, 2022 at 7:37 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > The compiler emitted hook used for ftrace consists of a PUSH {LR} to > preserve the link register, followed by a branch-and-link (BL) to > __gnu_mount_nc. Dynamic ftrace patches away the latter to turn the > combined sequence into a NOP, using a POP {LR} instruction. > > This is not necessary, since the link register does not get clobbered in > this case, and simply adding #4 to the stack pointer is sufficient, and > avoids a memory access that may take a few cycles to resolve depending > on the micro-architecture. Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org> > --- > arch/arm/kernel/entry-ftrace.S | 2 +- > arch/arm/kernel/ftrace.c | 4 ++-- > 2 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S > index f4886fb6e9ba..dca12a09322a 100644 > --- a/arch/arm/kernel/entry-ftrace.S > +++ b/arch/arm/kernel/entry-ftrace.S > @@ -27,7 +27,7 @@ > * allows it to be clobbered in subroutines and doesn't use it to hold > * parameters.) > * > - * When using dynamic ftrace, we patch out the mcount call by a "pop {lr}" > + * When using dynamic ftrace, we patch out the mcount call by a "add sp, #4" > * instead of the __gnu_mcount_nc call (see arch/arm/kernel/ftrace.c). > */ > > diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c > index a006585e1c09..db72d3a6522d 100644 > --- a/arch/arm/kernel/ftrace.c > +++ b/arch/arm/kernel/ftrace.c > @@ -25,9 +25,9 @@ > #include <asm/patch.h> > > #ifdef CONFIG_THUMB2_KERNEL > -#define NOP 0xf85deb04 /* pop.w {lr} */ > +#define NOP 0xf10d0d04 /* add.w sp, sp, #4 */ > #else > -#define NOP 0xe8bd4000 /* pop {lr} */ > +#define NOP 0xe28dd004 /* add sp, sp, #4 */ > #endif > > #ifdef CONFIG_DYNAMIC_FTRACE > -- > 2.30.2 >
On Tue, Jan 25, 2022 at 4:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > The compiler emitted hook used for ftrace consists of a PUSH {LR} to > preserve the link register, followed by a branch-and-link (BL) to > __gnu_mount_nc. Dynamic ftrace patches away the latter to turn the > combined sequence into a NOP, using a POP {LR} instruction. > > This is not necessary, since the link register does not get clobbered in > this case, and simply adding #4 to the stack pointer is sufficient, and > avoids a memory access that may take a few cycles to resolve depending > on the micro-architecture. > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org> That's on the hotpath for ftrace so definitely worth optimizing! Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Maybe some of the commit message should go into the code comment as well? "The compiler hook pushed 4 bytes with a PUSH{LR}, so discard them by simply adding 4 to the stack pointer" or so? Just a suggestion. Yours, Linus Walleij
diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S index f4886fb6e9ba..dca12a09322a 100644 --- a/arch/arm/kernel/entry-ftrace.S +++ b/arch/arm/kernel/entry-ftrace.S @@ -27,7 +27,7 @@ * allows it to be clobbered in subroutines and doesn't use it to hold * parameters.) * - * When using dynamic ftrace, we patch out the mcount call by a "pop {lr}" + * When using dynamic ftrace, we patch out the mcount call by a "add sp, #4" * instead of the __gnu_mcount_nc call (see arch/arm/kernel/ftrace.c). */ diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c index a006585e1c09..db72d3a6522d 100644 --- a/arch/arm/kernel/ftrace.c +++ b/arch/arm/kernel/ftrace.c @@ -25,9 +25,9 @@ #include <asm/patch.h> #ifdef CONFIG_THUMB2_KERNEL -#define NOP 0xf85deb04 /* pop.w {lr} */ +#define NOP 0xf10d0d04 /* add.w sp, sp, #4 */ #else -#define NOP 0xe8bd4000 /* pop {lr} */ +#define NOP 0xe28dd004 /* add sp, sp, #4 */ #endif #ifdef CONFIG_DYNAMIC_FTRACE
The compiler emitted hook used for ftrace consists of a PUSH {LR} to preserve the link register, followed by a branch-and-link (BL) to __gnu_mount_nc. Dynamic ftrace patches away the latter to turn the combined sequence into a NOP, using a POP {LR} instruction. This is not necessary, since the link register does not get clobbered in this case, and simply adding #4 to the stack pointer is sufficient, and avoids a memory access that may take a few cycles to resolve depending on the micro-architecture. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> --- arch/arm/kernel/entry-ftrace.S | 2 +- arch/arm/kernel/ftrace.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-)