[v5,00/17] powerpc: Core ftrace rework, support for ftrace direct and bpf trampolines

Message ID 20240915205648.830121-1-hbathini@linux.ibm.com

Message

Hari Bathini Sept. 15, 2024, 8:56 p.m. UTC
This is v5 of the series posted here:
https://lore.kernel.org/all/cover.1720942106.git.naveen@kernel.org/

This series reworks core ftrace support on powerpc to move the function
profiling sequence out of line. This lets us have a single nop at kernel
function entry, virtually eliminating the effect of the function tracer
when it is not enabled. The out-of-line function profiling sequence is
allocated at two separate places depending on a new config option.

For 64-bit powerpc, the function profiling sequence is also updated to
include an additional instruction 'mtlr r0' after the usual
two-instruction sequence to fix link stack imbalance (return address
predictor) when ftrace is enabled. This showed an improvement of ~10%
in the null_syscall benchmark (NR_LOOPS=10000000) on a Power 10 system
with ftrace enabled.

Finally, support for ftrace direct calls is added based on support for
DYNAMIC_FTRACE_WITH_CALL_OPS. BPF Trampoline support is added atop this.

Support for ftrace direct calls is added for 32-bit powerpc. There is
some code to enable bpf trampolines for 32-bit powerpc, but it is not
complete and will need to be pursued separately.

Patches 1 to 10 are independent of the rest of this series and can go
in separately. The remaining patches depend on the series from Benjamin
Gray adding support for patch_uint() and patch_ulong():
https://lore.kernel.org/all/172474280311.31690.1489687786264785049.b4-ty@ellerman.id.au/

Changelog v5:
* Intermediate files named .vmlinux.arch.* instead of .arch.vmlinux.*
* Fixed ftrace stack tracer failure due to inadvertent use of
  'add r7, r3, MCOUNT_INSN_SIZE' instruction instead of
  'addi r7, r3, MCOUNT_INSN_SIZE'
* Fixed build error for !CONFIG_MODULES case.
* .vmlinux.arch.* files compiled under arch/powerpc/tools
* Made sure .vmlinux.arch.* files are cleaned with `make clean`
* num_ool_stubs_text_end, used for setting up ftrace_ool_stub_text_end,
  is now set to zero when not required, instead of being computed as
  some random negative value.
* Resolved checkpatch.pl warnings.
* Dropped RFC tag.

Changelog v4:
- Patches 1, 10 and 13 are new.
- Address review comments from Nick. Numerous changes throughout the
  patch series.
- Extend support for ftrace ool to vmlinux text up to 64MB (patch 13).
- Address remaining TODOs in support for BPF Trampolines.
- Update synchronization when patching instructions during trampoline
  attach/detach.
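
For context on the 32MB and 64MB figures: a powerpc relative branch
(I-form 'b'/'bl') carries a signed 26-bit byte displacement, so it
reaches about 32MB in either direction. A minimal shell sketch of that
range check follows. BRANCH_RANGE and in_branch_range are hypothetical
names used only for illustration, and how the series actually places
its stubs to cover 64MB of text is not shown here.

```shell
#!/bin/sh
# Illustration only: these names do not come from the series.
# An I-form branch has a 24-bit LI field, shifted left by 2 and
# sign-extended, giving a signed 26-bit byte offset.
BRANCH_RANGE=$(( 1 << 25 ))   # 33554432 bytes = 32MB

# in_branch_range FROM TO: succeed if address TO can be reached from
# address FROM with a single relative branch.
in_branch_range() {
    off=$(( $2 - $1 ))
    [ "${off}" -ge "$(( 0 - BRANCH_RANGE ))" ] && [ "${off}" -lt "${BRANCH_RANGE}" ]
}
```

With this check, a caller 32MB into the text can reach a target at
offset 0 as well as one just under 64MB, which is one way a 64MB span
could be served by stubs on both sides. That reading is an assumption,
not a description of patch 13.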


Naveen N Rao (17):
  powerpc/trace: Account for -fpatchable-function-entry support by
    toolchain
  powerpc/kprobes: Use ftrace to determine if a probe is at function
    entry
  powerpc64/ftrace: Nop out additional 'std' instruction emitted by gcc
    v5.x
  powerpc32/ftrace: Unify 32-bit and 64-bit ftrace entry code
  powerpc/module_64: Convert #ifdef to IS_ENABLED()
  powerpc/ftrace: Remove pointer to struct module from dyn_arch_ftrace
  powerpc/ftrace: Skip instruction patching if the instructions are the
    same
  powerpc/ftrace: Move ftrace stub used for init text before _einittext
  powerpc64/bpf: Fold bpf_jit_emit_func_call_hlp() into
    bpf_jit_emit_func_call_rel()
  powerpc/ftrace: Add a postlink script to validate function tracer
  kbuild: Add generic hook for architectures to use before the final
    vmlinux link
  powerpc64/ftrace: Move ftrace sequence out of line
  powerpc64/ftrace: Support .text larger than 32MB with out-of-line
    stubs
  powerpc/ftrace: Add support for DYNAMIC_FTRACE_WITH_CALL_OPS
  powerpc/ftrace: Add support for DYNAMIC_FTRACE_WITH_DIRECT_CALLS
  samples/ftrace: Add support for ftrace direct samples on powerpc
  powerpc64/bpf: Add support for bpf trampolines

 arch/Kconfig                                |   6 +
 arch/powerpc/Kbuild                         |   2 +-
 arch/powerpc/Kconfig                        |  23 +-
 arch/powerpc/Makefile                       |   8 +
 arch/powerpc/Makefile.postlink              |   8 +
 arch/powerpc/include/asm/ftrace.h           |  33 +-
 arch/powerpc/include/asm/module.h           |   5 +
 arch/powerpc/include/asm/ppc-opcode.h       |  14 +
 arch/powerpc/kernel/asm-offsets.c           |  11 +
 arch/powerpc/kernel/kprobes.c               |  18 +-
 arch/powerpc/kernel/module_64.c             |  66 +-
 arch/powerpc/kernel/trace/Makefile          |  11 +-
 arch/powerpc/kernel/trace/ftrace.c          | 298 ++++++-
 arch/powerpc/kernel/trace/ftrace_64_pg.c    |  69 +-
 arch/powerpc/kernel/trace/ftrace_entry.S    | 244 ++++--
 arch/powerpc/kernel/vmlinux.lds.S           |   3 +-
 arch/powerpc/net/bpf_jit.h                  |  12 +
 arch/powerpc/net/bpf_jit_comp.c             | 847 +++++++++++++++++++-
 arch/powerpc/net/bpf_jit_comp32.c           |   7 +-
 arch/powerpc/net/bpf_jit_comp64.c           |  68 +-
 arch/powerpc/tools/Makefile                 |  12 +
 arch/powerpc/tools/ftrace-gen-ool-stubs.sh  |  52 ++
 arch/powerpc/tools/ftrace_check.sh          |  50 ++
 samples/ftrace/ftrace-direct-modify.c       |  85 +-
 samples/ftrace/ftrace-direct-multi-modify.c | 101 ++-
 samples/ftrace/ftrace-direct-multi.c        |  79 +-
 samples/ftrace/ftrace-direct-too.c          |  83 +-
 samples/ftrace/ftrace-direct.c              |  69 +-
 scripts/Makefile.vmlinux                    |   7 +
 scripts/link-vmlinux.sh                     |   7 +-
 30 files changed, 2098 insertions(+), 200 deletions(-)
 create mode 100644 arch/powerpc/tools/Makefile
 create mode 100755 arch/powerpc/tools/ftrace-gen-ool-stubs.sh
 create mode 100755 arch/powerpc/tools/ftrace_check.sh

Comments

Masahiro Yamada Oct. 9, 2024, 3:46 p.m. UTC | #1
On Mon, Sep 16, 2024 at 5:57 AM Hari Bathini <hbathini@linux.ibm.com> wrote:
>
> This is v5 of the series posted here:
> https://lore.kernel.org/all/cover.1720942106.git.naveen@kernel.org/
>
> This series reworks core ftrace support on powerpc to have the function
> profiling sequence moved out of line. This enables us to have a single
> nop at kernel function entry virtually eliminating effect of the
> function tracer when it is not enabled. The function profile sequence is
> moved out of line and is allocated at two separate places depending on a
> new config option.
>
> For 64-bit powerpc, the function profiling sequence is also updated to
> include an additional instruction 'mtlr r0' after the usual
> two-instruction sequence to fix link stack imbalance (return address
> predictor) when ftrace is enabled. This showed an improvement of ~10%
> in null_syscall benchmark (NR_LOOPS=10000000) on a Power 10 system
> with ftrace enabled.
>
> Finally, support for ftrace direct calls is added based on support for
> DYNAMIC_FTRACE_WITH_CALL_OPS. BPF Trampoline support is added atop this.
>
> Support for ftrace direct calls is added for 32-bit powerpc. There is
> some code to enable bpf trampolines for 32-bit powerpc, but it is not
> complete and will need to be pursued separately.
>
> Patches 1 to 10 are independent of this series and can go in separately
> though. Rest of the patches depend on the series from Benjamin Gray
> adding support for patch_uint() and patch_ulong():
> https://lore.kernel.org/all/172474280311.31690.1489687786264785049.b4-ty@ellerman.id.au/

It is getting better.

I attached a diff for improvements.

Also, please run 'shellcheck' and eliminate
as many warnings as you can.

$ shellcheck arch/powerpc/tools/ftrace-gen-ool-stubs.sh

In arch/powerpc/tools/ftrace-gen-ool-stubs.sh line 19:
num_ool_stubs_text=$(${OBJDUMP} -r -j __patchable_function_entries ${vmlinux_o} |
                                                                   ^----------^ SC2086 (info): Double quote to prevent globbing and word splitting.

Did you mean:
num_ool_stubs_text=$(${OBJDUMP} -r -j __patchable_function_entries "${vmlinux_o}" |


In arch/powerpc/tools/ftrace-gen-ool-stubs.sh line 20:
     grep -v ".init.text" | grep "${RELOCATION}" | wc -l)
                            ^------------------^ SC2126 (style): Consider using 'grep -c' instead of 'grep|wc -l'.


In arch/powerpc/tools/ftrace-gen-ool-stubs.sh line 21:
num_ool_stubs_inittext=$(${OBJDUMP} -r -j __patchable_function_entries ${vmlinux_o} |
                                                                       ^----------^ SC2086 (info): Double quote to prevent globbing and word splitting.

Did you mean:
num_ool_stubs_inittext=$(${OBJDUMP} -r -j __patchable_function_entries "${vmlinux_o}" |


In arch/powerpc/tools/ftrace-gen-ool-stubs.sh line 22:
grep ".init.text" | grep "${RELOCATION}" | wc -l)
                    ^------------------^ SC2126 (style): Consider using 'grep -c' instead of 'grep|wc -l'.


In arch/powerpc/tools/ftrace-gen-ool-stubs.sh line 25:
if [ ${num_ool_stubs_text} -gt ${num_ool_stubs_text_builtin} ]; then
     ^-------------------^ SC2086 (info): Double quote to prevent globbing and word splitting.
                               ^---------------------------^ SC2086 (info): Double quote to prevent globbing and word splitting.

Did you mean:
if [ "${num_ool_stubs_text}" -gt "${num_ool_stubs_text_builtin}" ]; then


In arch/powerpc/tools/ftrace-gen-ool-stubs.sh line 26:
num_ool_stubs_text_end=$(expr ${num_ool_stubs_text} - ${num_ool_stubs_text_builtin})
                         ^--^ SC2003 (style): expr is antiquated. Consider rewriting this using $((..)), ${} or [[ ]].
                              ^-------------------^ SC2086 (info): Double quote to prevent globbing and word splitting.
                                                      ^---------------------------^ SC2086 (info): Double quote to prevent globbing and word splitting.

Did you mean:
num_ool_stubs_text_end=$(expr "${num_ool_stubs_text}" - "${num_ool_stubs_text_builtin}")


In arch/powerpc/tools/ftrace-gen-ool-stubs.sh line 31:
cat > ${arch_vmlinux_S} <<EOF
      ^---------------^ SC2086 (info): Double quote to prevent globbing and word splitting.

Did you mean:
cat > "${arch_vmlinux_S}" <<EOF

For more information:
  https://www.shellcheck.net/wiki/SC2086 -- Double quote to prevent globbing ...
  https://www.shellcheck.net/wiki/SC2003 -- expr is antiquated. Consider rewr...
  https://www.shellcheck.net/wiki/SC2126 -- Consider using 'grep -c' instead ...
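
For illustration, the flagged lines could be written roughly as below
with the three kinds of warning addressed: quoted expansions (SC2086),
'grep -c' in place of 'grep | wc -l' (SC2126), and $((..)) in place of
expr (SC2003). This is only a sketch. The count_relocs helper, the
sample relocation lines, and the values given to RELOCATION and
num_ool_stubs_text_builtin are assumptions made so the snippet runs on
its own; they are not taken from the script.

```shell
#!/bin/sh
# Assumed example value; the real script sets this elsewhere.
RELOCATION="R_PPC64_ADDR64"

# count_relocs MODE: read 'objdump -r' style lines on stdin and print
# how many contain ${RELOCATION}. MODE "init" counts only .init.text
# lines; any other MODE counts everything except .init.text lines.
count_relocs() {
    if [ "$1" = "init" ]; then
        grep ".init.text" | grep -c "${RELOCATION}"
    else
        grep -v ".init.text" | grep -c "${RELOCATION}"
    fi
    # grep -c exits with status 1 when the count is 0; do not treat
    # an empty result as a failure.
    return 0
}

# Hypothetical stand-in for the objdump output.
sample='0000000000000000 R_PPC64_ADDR64    .text+0x0000000000000008
0000000000000008 R_PPC64_ADDR64    .text+0x0000000000000048
0000000000000010 R_PPC64_ADDR64    .init.text+0x0000000000000008'

num_ool_stubs_text=$(printf '%s\n' "${sample}" | count_relocs text)
num_ool_stubs_inittext=$(printf '%s\n' "${sample}" | count_relocs init)

# Assumed value for the sketch.
num_ool_stubs_text_builtin=1

if [ "${num_ool_stubs_text}" -gt "${num_ool_stubs_text_builtin}" ]; then
    num_ool_stubs_text_end=$(( num_ool_stubs_text - num_ool_stubs_text_builtin ))
else
    # Per the v5 changelog: zero rather than a negative value.
    num_ool_stubs_text_end=0
fi
```

One thing to watch when switching to 'grep -c': it still prints 0 when
nothing matches, but it exits non-zero in that case, which matters if
the script runs under 'set -e'.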
