Message ID | 20210506071041.417854-1-palmer@dabbelt.com (mailing list archive)
---|---
State | New, archived
Series | RISC-V: Don't check text_mutex during stop_machine
Hi Palmer,

I love your patch! Yet something to improve:

[auto build test ERROR on tip/perf/core]
[also build test ERROR on linus/master v5.12 next-20210506]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Palmer-Dabbelt/RISC-V-Don-t-check-text_mutex-during-stop_machine/20210506-151316
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git ed8e50800bf4c2d904db9c75408a67085e6cca3d
config: riscv-randconfig-c023-20210506 (attached as .config)
compiler: riscv32-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/61fc593f7bfbd5a52e0e5ba3a55f02df81df08d5
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Palmer-Dabbelt/RISC-V-Don-t-check-text_mutex-during-stop_machine/20210506-151316
        git checkout 61fc593f7bfbd5a52e0e5ba3a55f02df81df08d5
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross W=1 ARCH=riscv

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   riscv32-linux-ld: arch/riscv/kernel/patch.o: in function `patch_insn_write':
>> patch.c:(.text+0x68): undefined reference to `riscv_ftrace_in_stop_machine'
>> riscv32-linux-ld: patch.c:(.text+0x6c): undefined reference to `riscv_ftrace_in_stop_machine'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Thu, 6 May 2021 00:10:41 -0700
Palmer Dabbelt <palmer@dabbelt.com> wrote:

> diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
> index 7f1e5203de88..da2405652f1d 100644
> --- a/arch/riscv/kernel/ftrace.c
> +++ b/arch/riscv/kernel/ftrace.c
> @@ -11,6 +11,8 @@
>  #include <asm/cacheflush.h>
>  #include <asm/patch.h>
>  
> +int riscv_ftrace_in_stop_machine;
> +
>  #ifdef CONFIG_DYNAMIC_FTRACE
>  int ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex)
>  {
> @@ -232,3 +234,16 @@ int ftrace_disable_ftrace_graph_caller(void)
>  }
>  #endif /* CONFIG_DYNAMIC_FTRACE */
>  #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
> +
> +void arch_ftrace_update_code(int command)
> +{
> +	/*
> +	 * The code sequences we use for ftrace can't be patched while the
> +	 * kernel is running, so we need to use stop_machine() to modify them
> +	 * for now. This doesn't play nice with text_mutex, we use this flag
> +	 * to elide the check.
> +	 */
> +	riscv_ftrace_in_stop_machine = true;
> +	ftrace_run_stop_machine(command);
> +	riscv_ftrace_in_stop_machine = false;
> +}
> diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
> index 0b552873a577..7983dba477f0 100644

This would work, but my suggestion was to do it without having to add this
arch function. Because the caller of this is:

static void ftrace_run_update_code(int command)
{
	int ret;

	ret = ftrace_arch_code_modify_prepare();
	FTRACE_WARN_ON(ret);
	if (ret)
		return;

	/*
	 * By default we use stop_machine() to modify the code.
	 * But archs can do what ever they want as long as it
	 * is safe. The stop_machine() is the safest, but also
	 * produces the most overhead.
	 */
	arch_ftrace_update_code(command);

	ret = ftrace_arch_code_modify_post_process();
	FTRACE_WARN_ON(ret);
}

Where you already have two hooks that you use to take the text_mutex before
calling arch_ftrace_update_code(). In RISC-V those are:

int ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex)
{
	mutex_lock(&text_mutex);
	return 0;
}

int ftrace_arch_code_modify_post_process(void) __releases(&text_mutex)
{
	mutex_unlock(&text_mutex);
	return 0;
}

Where all you have to do is change them to:

int ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex)
{
	mutex_lock(&text_mutex);
	riscv_ftrace_in_stop_machine = true;
	return 0;
}

int ftrace_arch_code_modify_post_process(void) __releases(&text_mutex)
{
	riscv_ftrace_in_stop_machine = false;
	mutex_unlock(&text_mutex);
	return 0;
}

And you have the exact same effect. Those functions are only used before
calling the stop machine code you have.

-- Steve
On Thu, 6 May 2021 00:10:41 -0700
Palmer Dabbelt <palmer@dabbelt.com> wrote:

> ---
> In theory we should be able to avoid using stop_machine() with some
> clever code sequences, but that's too big of a change to be considered a
> fix. I also can't find the text I thought was in the ISA manual about
> the allowed behaviors for concurrent modification of the instruction
> stream, so I might have just mis-remembered that.
> ---

I wonder if you could at least use break points, as some other archs do,
and what x86 does.

If you have this make believe machine code:

	00 00 00 00	nop

And you want to turn it into a call.

	aa 12 34 56	bl ftrace_caller

And if your architecture has a way to inject a break point on live code.
Let's call this FF for the break point code.

You can inject that first:

	FF 00 00 00

sync all CPUs where now all callers will hit this and jump to the break
point handler, which simply does:

	ip = ip + 4;
	return;

and returns back to the instruction after this nop/call.

Change the rest of the instruction.

	FF 12 34 56

sync all CPUs so that they all see this new instruction, but are still
triggering the break point.

Then finally remove the break point.

	aa 12 34 56

And you just switched from the nop to the call without using stop machine.

-- Steve
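To make the three phases above concrete, here is a minimal C sketch. Every name in it is a hypothetical stand-in: patch_byte(), patch_bytes(), smp_text_sync(), and BREAK_OPCODE are not real kernel APIs, just placeholders for whatever byte-patching and cross-CPU instruction-sync primitives an architecture provides.

	#include <linux/types.h>

	/*
	 * Sketch of the three-phase break-point patching flow.  All helpers
	 * here are hypothetical placeholders, not real kernel interfaces.
	 */
	static void live_patch_call_site(u8 *site, const u8 *call, size_t len)
	{
		/* 1. Arm the break point on the first byte of the site. */
		patch_byte(site, BREAK_OPCODE);
		smp_text_sync();	/* all CPUs now trap here and skip the site */

		/* 2. Rewrite the remaining bytes while the trap masks them. */
		patch_bytes(site + 1, call + 1, len - 1);
		smp_text_sync();	/* CPUs see the new tail but still trap */

		/* 3. Disarm the break point, exposing the completed call. */
		patch_byte(site, call[0]);
		smp_text_sync();
	}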
On Thu, 06 May 2021 06:25:50 PDT (-0700), rostedt@goodmis.org wrote:
> On Thu, 6 May 2021 00:10:41 -0700
> Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
>> ---
>> In theory we should be able to avoid using stop_machine() with some
>> clever code sequences, but that's too big of a change to be considered a
>> fix. I also can't find the text I thought was in the ISA manual about
>> the allowed behaviors for concurrent modification of the instruction
>> stream, so I might have just mis-remembered that.
>> ---
>
> I wonder if you could at least use break points, as some other archs do,
> and what x86 does.
>
> If you have this make believe machine code:
>
> 	00 00 00 00	nop
>
> And you want to turn it into a call.
>
> 	aa 12 34 56	bl ftrace_caller
>
> And if your architecture has a way to inject a break point on live code.
> Let's call this FF for the break point code.
>
> You can inject that first:
>
> 	FF 00 00 00

That would rely on this "concurrent writes to the instruction stream
don't tear" property that I thought we'd put in the ISA but I can't
actually find in the text of the document.

That said, I hadn't actually thought about replacing single bytes in the
instruction stream.  In theory we don't have anything that guarantees
those won't tear, but I'm not sure how any reasonable implementation
would go about manifesting those, so it might be an easy retrofit to the
ISA.

I ran through our breakpoint encodings and don't think those help, but
we do have 0x00000013 as a NOP (addi x0, x0, 0), which is one byte away
from 0x00000000 (which is defined to be an invalid instruction on all
RISC-V systems).  There's really no difference between how a breakpoint
traps and how an illegal instruction traps on RISC-V, so we should be
able to use that to construct some sort of sequence.

So we can start with

	nop
	nop
	nop
	nop
	nop

then do a single-byte write to turn it into

	unimp
	nop
	nop
	nop
	nop

we can then IPI to all the harts in order to get them on the same page
about that trap, which we can then skip over.  We'll need some way to
differentiate this from accidental executions of unimp, but we can just
build up a table somewhere (it wasn't immediately clear how x86 does
this).  Then we have no ordering restrictions on converting the rest of
the stub into what's necessary to trace a function, which should look
the same as what we have now:

	unimp
	save ra, ...(sp)
	auipc ra, ...
	jalr ra, ...
	load ra, ...(sp)

At which point we can IPI everywhere to sync the instruction stream
again, before we turn off the trap:

	nop
	save ra, ...(sp)
	auipc ra, ...
	jalr ra, ...
	load ra, ...(sp)

> sync all CPUs where now all callers will hit this and jump to the break
> point handler, which simply does:
>
> 	ip = ip + 4;
> 	return;
>
> and returns back to the instruction after this nop/call.

There'll be some slight headaches there because the length of the all-0
unimplemented instruction depends on which extensions are supported, but
we can detect ILEN at runtime so that shouldn't be a showstopper.

> Change the rest of the instruction.
>
> 	FF 12 34 56
>
> sync all CPUs so that they all see this new instruction, but are still
> triggering the break point.
>
> Then finally remove the break point.
>
> 	aa 12 34 56
>
> And you just switched from the nop to the call without using stop machine.

OK, ya, I think that all should work.  We're still depending on this
"single byte writes to the instruction stream don't tear" guarantee, but
I think that's a bit pedantic.  I'll open a question on the ISA spec
just to be sure, one I can link here.
I'm probably not going to have the time to do this for a bit because I'm
buried in this non-coherent stuff, but if someone wants to take a shot
at it I'd be happy to take a look.

Thanks for pointing this out, this should get us away from
stop_machine() way sooner than waiting for a multi-byte modification
sequence would!
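For the first step Palmer sketches — turning the leading nop into a trapping instruction with a single-byte store — something like the following could work, reusing the existing RISC-V patch_text_nosync() helper; whether a one-byte write to the instruction stream is guaranteed not to tear is exactly the ISA question raised above. The function name is made up for illustration.

	#include <linux/types.h>
	#include <asm/patch.h>

	/*
	 * Sketch only.  RISC-V is little-endian, so the low byte of the
	 * 32-bit nop (0x00000013) sits first in memory: 13 00 00 00.
	 * Zeroing that single byte yields 00 00 00 00, which is defined
	 * to be an invalid instruction on all RISC-V systems.
	 */
	static void arm_ftrace_trap(void *addr)
	{
		static const u8 zero;

		/* 13 00 00 00 (nop) -> 00 00 00 00 (unimp) */
		patch_text_nosync(addr, &zero, 1);
	}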
On Sat, 22 May 2021 12:32:05 -0700 (PDT)
Palmer Dabbelt <palmer@dabbelt.com> wrote:

> we can then IPI to all the harts in order to get them on the same page
> about that trap, which we can then skip over.  We'll need some way to
> differentiate this from accidental executions of unimp, but we can just
> build up a table somewhere (it wasn't immediately clear how x86 does

It currently uses the same code as text_poke does, which does a
"batching" and keeps track of the locations that were modified. But
before that change (768ae4406a x86/ftrace: Use text_poke()), it had:

int ftrace_int3_handler(struct pt_regs *regs)
{
	unsigned long ip;

	if (WARN_ON_ONCE(!regs))
		return 0;

	ip = regs->ip - INT3_INSN_SIZE;

	if (ftrace_location(ip)) {
		int3_emulate_call(regs, (unsigned long)ftrace_regs_caller);
		return 1;
	} else if (is_ftrace_caller(ip)) {
		if (!ftrace_update_func_call) {
			int3_emulate_jmp(regs, ip + CALL_INSN_SIZE);
			return 1;
		}
		int3_emulate_call(regs, ftrace_update_func_call);
		return 1;
	}

	return 0;
}

That "ftrace_location()" is the table you are looking for. It tells you
whether or not the location is registered with ftrace (i.e. the mcount
call).

The "int3_emulate_jmp()" is needed to handle the case that we switch from
one trampoline to another trampoline. But that's also an architecture
specific feature, and RISC-V may not have that yet.

-- Steve

> this). Then we have no ordering restrictions on converting the rest of
> the stub into what's necessary to trace a function, which should look
> the same as what we have now
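Carried over to the unimp-based scheme discussed in this thread, a RISC-V analogue of that handler might look roughly like the sketch below. The hook name and the 20-byte stub size are assumptions made up for illustration; ftrace_location() is the real API Steve refers to, returning nonzero when an address is a registered ftrace call site.

	#include <linux/ftrace.h>
	#include <asm/ptrace.h>

	#define FTRACE_STUB_LEN	(5 * 4)	/* assumed: stub is 5 32-bit insns */

	/* Hypothetical hook, called from the illegal-instruction trap path. */
	static bool riscv_ftrace_trap_handler(struct pt_regs *regs)
	{
		unsigned long ip = regs->epc;

		/* Not a patched ftrace site: let normal unimp handling run. */
		if (!ftrace_location(ip))
			return false;

		/*
		 * The stub is mid-patch; act as if the function were not
		 * traced by skipping the whole stub, not just the trapping
		 * instruction, so a half-written sequence never executes.
		 */
		regs->epc = ip + FTRACE_STUB_LEN;
		return true;
	}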
diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
index 04dad3380041..ee8f07b4dbee 100644
--- a/arch/riscv/include/asm/ftrace.h
+++ b/arch/riscv/include/asm/ftrace.h
@@ -85,4 +85,8 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
 
 #endif
 
+#ifndef __ASSEMBLY__
+extern int riscv_ftrace_in_stop_machine;
+#endif
+
 #endif /* _ASM_RISCV_FTRACE_H */
diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
index 7f1e5203de88..da2405652f1d 100644
--- a/arch/riscv/kernel/ftrace.c
+++ b/arch/riscv/kernel/ftrace.c
@@ -11,6 +11,8 @@
 #include <asm/cacheflush.h>
 #include <asm/patch.h>
 
+int riscv_ftrace_in_stop_machine;
+
 #ifdef CONFIG_DYNAMIC_FTRACE
 int ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex)
 {
@@ -232,3 +234,16 @@ int ftrace_disable_ftrace_graph_caller(void)
 }
 #endif /* CONFIG_DYNAMIC_FTRACE */
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+
+void arch_ftrace_update_code(int command)
+{
+	/*
+	 * The code sequences we use for ftrace can't be patched while the
+	 * kernel is running, so we need to use stop_machine() to modify them
+	 * for now. This doesn't play nice with text_mutex, we use this flag
+	 * to elide the check.
+	 */
+	riscv_ftrace_in_stop_machine = true;
+	ftrace_run_stop_machine(command);
+	riscv_ftrace_in_stop_machine = false;
+}
diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
index 0b552873a577..7983dba477f0 100644
--- a/arch/riscv/kernel/patch.c
+++ b/arch/riscv/kernel/patch.c
@@ -11,6 +11,7 @@
 #include <asm/kprobes.h>
 #include <asm/cacheflush.h>
 #include <asm/fixmap.h>
+#include <asm/ftrace.h>
 #include <asm/patch.h>
 
 struct patch_insn {
@@ -59,8 +60,15 @@ static int patch_insn_write(void *addr, const void *insn, size_t len)
 	 * Before reaching here, it was expected to lock the text_mutex
 	 * already, so we don't need to give another lock here and could
 	 * ensure that it was safe between each cores.
+	 *
+	 * We're currently using stop_machine() for ftrace, and while that
+	 * ensures text_mutex is held before installing the mappings it does
+	 * not ensure text_mutex is held by the calling thread.  That's safe
+	 * but triggers a lockdep failure, so just elide it for that specific
+	 * case.
 	 */
-	lockdep_assert_held(&text_mutex);
+	if (!riscv_ftrace_in_stop_machine)
+		lockdep_assert_held(&text_mutex);
 
 	if (across_pages)
 		patch_map(addr + len, FIX_TEXT_POKE1);