mbox series

[riscv/for-next,0/4] Enable ftrace with kernel preemption for RISC-V

Message ID 20220609071833.1051941-1-andy.chiu@sifive.com (mailing list archive)
Headers show
Series Enable ftrace with kernel preemption for RISC-V | expand

Message

Andy Chiu June 9, 2022, 7:18 a.m. UTC
This patch remove dependency of dynamic ftrace from calling
stop_machine(), and make it compatiable with kernel preemption.
Originally, we ran into stack corruptions, or execution of partially
updated instructions when starting or stopping ftrace on a fully
preemptible kernel configuration. The reason is that kernel periodically
calls rcu_momentary_dyntick_idle() on cores waiting for the code-patching
core running in ftrace. Though rcu_momentary_dyntick_idle() itself is
marked as notrace, it would call a bunch of tracable functions if we
configured the kernel as preemptible. For example, these are some functions
that happened to have a symbol and have not been marked as notrace on a
RISC-V preemptible kernel compiled with GCC-11:
 - __rcu_report_exp_rnp()
 - rcu_report_exp_cpu_mult()
 - rcu_preempt_deferred_qs()
 - rcu_preempt_need_deferred_qs()
 - rcu_preempt_deferred_qs_irqrestore()

Thus, this make it not ideal for us to rely on stop_machine() and
handly marked "notrace"s to perform runtime code patching. To remove
such dependency, we must make updates of code seemed atomic on running
cores. This might not be obvious for RISC-V since it usaually uses a pair
of AUIPC + JALR to perform a long jump, which cannot be modified and
executed concurrently if we consider preemptions. As such, this patch
proposed a way to make it possible. It embeds a 32-bit rel-address data
into instructions of each ftrace prologue and jumps indirectly. In this
way, we could store and load the address atomically so that the code
patching core could run simutaneously with the rest of running cores.

After applying the patchset, we compiled a preemptible kernel with all
tracers and ftrace-selftest enabled, and booted it on a 2-core QEMU virt
machine. The kernel could boot up successfully, passing all ftrace
testsuits. Besides, we ran a script that randomly pick a tracer on every
0~5 seconds. The kernel has sustained over 20K rounds of the test. In
contrast, a preemptible kernel without our patch would panic in few
rounds on the same machine.

However, we did run into errors when using hwlat or irqsoff tracers
together with cpu-online stressor from stress-ng on a preemptible kernel.
The stressor would randomly take a cpu from 1-N offline and online using
hotplug mechanism. Nonotheless, since we think there is little chance that
it is related to our changes in ftrace, we are working on it in parallel,
and may come up with another patch soon.

Andy Chiu (4):
  riscv: align ftrace to 4 Byte boundary and increase ftrace prologue
    size
  riscv: export patch_insn_write
  riscv: ftrace: use indirect jump to work with kernel preemption
  riscv: ftrace: do not use stop_machine to update code

 arch/riscv/Makefile             |   2 +-
 arch/riscv/include/asm/ftrace.h |  24 ------
 arch/riscv/include/asm/patch.h  |   1 +
 arch/riscv/kernel/ftrace.c      | 135 ++++++++++++++++++++------------
 arch/riscv/kernel/mcount-dyn.S  |  62 +++++++++++----
 arch/riscv/kernel/patch.c       |   4 +-
 6 files changed, 136 insertions(+), 92 deletions(-)

Comments

Andy Chiu June 16, 2022, 8:45 a.m. UTC | #1
Loop in: Guo Ren


On Thu, Jun 9, 2022 at 3:21 PM Andy Chiu <andy.chiu@sifive.com> wrote:
>
> This patch remove dependency of dynamic ftrace from calling
> stop_machine(), and make it compatiable with kernel preemption.
> Originally, we ran into stack corruptions, or execution of partially
> updated instructions when starting or stopping ftrace on a fully
> preemptible kernel configuration. The reason is that kernel periodically
> calls rcu_momentary_dyntick_idle() on cores waiting for the code-patching
> core running in ftrace. Though rcu_momentary_dyntick_idle() itself is
> marked as notrace, it would call a bunch of tracable functions if we
> configured the kernel as preemptible. For example, these are some functions
> that happened to have a symbol and have not been marked as notrace on a
> RISC-V preemptible kernel compiled with GCC-11:
>  - __rcu_report_exp_rnp()
>  - rcu_report_exp_cpu_mult()
>  - rcu_preempt_deferred_qs()
>  - rcu_preempt_need_deferred_qs()
>  - rcu_preempt_deferred_qs_irqrestore()
>
> Thus, this make it not ideal for us to rely on stop_machine() and
> handly marked "notrace"s to perform runtime code patching. To remove
> such dependency, we must make updates of code seemed atomic on running
> cores. This might not be obvious for RISC-V since it usaually uses a pair
> of AUIPC + JALR to perform a long jump, which cannot be modified and
> executed concurrently if we consider preemptions. As such, this patch
> proposed a way to make it possible. It embeds a 32-bit rel-address data
> into instructions of each ftrace prologue and jumps indirectly. In this
> way, we could store and load the address atomically so that the code
> patching core could run simutaneously with the rest of running cores.
>
> After applying the patchset, we compiled a preemptible kernel with all
> tracers and ftrace-selftest enabled, and booted it on a 2-core QEMU virt
> machine. The kernel could boot up successfully, passing all ftrace
> testsuits. Besides, we ran a script that randomly pick a tracer on every
> 0~5 seconds. The kernel has sustained over 20K rounds of the test. In
> contrast, a preemptible kernel without our patch would panic in few
> rounds on the same machine.
>
> However, we did run into errors when using hwlat or irqsoff tracers
> together with cpu-online stressor from stress-ng on a preemptible kernel.
> The stressor would randomly take a cpu from 1-N offline and online using
> hotplug mechanism. Nonotheless, since we think there is little chance that
> it is related to our changes in ftrace, we are working on it in parallel,
> and may come up with another patch soon.
>
> Andy Chiu (4):
>   riscv: align ftrace to 4 Byte boundary and increase ftrace prologue
>     size
>   riscv: export patch_insn_write
>   riscv: ftrace: use indirect jump to work with kernel preemption
>   riscv: ftrace: do not use stop_machine to update code
>
>  arch/riscv/Makefile             |   2 +-
>  arch/riscv/include/asm/ftrace.h |  24 ------
>  arch/riscv/include/asm/patch.h  |   1 +
>  arch/riscv/kernel/ftrace.c      | 135 ++++++++++++++++++++------------
>  arch/riscv/kernel/mcount-dyn.S  |  62 +++++++++++----
>  arch/riscv/kernel/patch.c       |   4 +-
>  6 files changed, 136 insertions(+), 92 deletions(-)
>
> --
> 2.36.0
>
Andy Chiu July 8, 2022, 7:56 a.m. UTC | #2
This is just a gentle ping. If there are concerns or problems
regarding this patch, please let me know. Thanks!

> On Thu, Jun 9, 2022 at 3:21 PM Andy Chiu <andy.chiu@sifive.com> wrote:
> >
> > This patch remove dependency of dynamic ftrace from calling
> > stop_machine(), and make it compatiable with kernel preemption.
> > Originally, we ran into stack corruptions, or execution of partially
> > updated instructions when starting or stopping ftrace on a fully
> > preemptible kernel configuration. The reason is that kernel periodically
> > calls rcu_momentary_dyntick_idle() on cores waiting for the code-patching
> > core running in ftrace. Though rcu_momentary_dyntick_idle() itself is
> > marked as notrace, it would call a bunch of tracable functions if we
> > configured the kernel as preemptible. For example, these are some functions
> > that happened to have a symbol and have not been marked as notrace on a
> > RISC-V preemptible kernel compiled with GCC-11:
> >  - __rcu_report_exp_rnp()
> >  - rcu_report_exp_cpu_mult()
> >  - rcu_preempt_deferred_qs()
> >  - rcu_preempt_need_deferred_qs()
> >  - rcu_preempt_deferred_qs_irqrestore()
> >
> > Thus, this make it not ideal for us to rely on stop_machine() and
> > handly marked "notrace"s to perform runtime code patching. To remove
> > such dependency, we must make updates of code seemed atomic on running
> > cores. This might not be obvious for RISC-V since it usaually uses a pair
> > of AUIPC + JALR to perform a long jump, which cannot be modified and
> > executed concurrently if we consider preemptions. As such, this patch
> > proposed a way to make it possible. It embeds a 32-bit rel-address data
> > into instructions of each ftrace prologue and jumps indirectly. In this
> > way, we could store and load the address atomically so that the code
> > patching core could run simutaneously with the rest of running cores.
> >
> > After applying the patchset, we compiled a preemptible kernel with all
> > tracers and ftrace-selftest enabled, and booted it on a 2-core QEMU virt
> > machine. The kernel could boot up successfully, passing all ftrace
> > testsuits. Besides, we ran a script that randomly pick a tracer on every
> > 0~5 seconds. The kernel has sustained over 20K rounds of the test. In
> > contrast, a preemptible kernel without our patch would panic in few
> > rounds on the same machine.
> >
> > However, we did run into errors when using hwlat or irqsoff tracers
> > together with cpu-online stressor from stress-ng on a preemptible kernel.
> > The stressor would randomly take a cpu from 1-N offline and online using
> > hotplug mechanism. Nonotheless, since we think there is little chance that
> > it is related to our changes in ftrace, we are working on it in parallel,
> > and may come up with another patch soon.
> >
> > Andy Chiu (4):
> >   riscv: align ftrace to 4 Byte boundary and increase ftrace prologue
> >     size
> >   riscv: export patch_insn_write
> >   riscv: ftrace: use indirect jump to work with kernel preemption
> >   riscv: ftrace: do not use stop_machine to update code
> >
> >  arch/riscv/Makefile             |   2 +-
> >  arch/riscv/include/asm/ftrace.h |  24 ------
> >  arch/riscv/include/asm/patch.h  |   1 +
> >  arch/riscv/kernel/ftrace.c      | 135 ++++++++++++++++++++------------
> >  arch/riscv/kernel/mcount-dyn.S  |  62 +++++++++++----
> >  arch/riscv/kernel/patch.c       |   4 +-
> >  6 files changed, 136 insertions(+), 92 deletions(-)
> >
> > --
> > 2.36.0
> >

Regards,
Andy