| Message ID | a8756482-024c-c858-b3d1-1ffa9a5eb3f7@linux.alibaba.com (mailing list archive) |
|---|---|
| State | Superseded |
| Series | ftrace: make sure preemption disabled on recursion testing |
On Tue, 12 Oct 2021 13:40:08 +0800
王贇 <yun.wang@linux.alibaba.com> wrote:

> @@ -52,11 +52,6 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>  	if (WARN_ON_ONCE(bit < 0))
>  		return;
> -	/*
> -	 * A variant of synchronize_rcu() is used to allow patching functions
> -	 * where RCU is not watching, see klp_synchronize_transition().
> -	 */

I have to take a deeper look at this patch set, but do not remove this
comment, as it explains the protection here, that is not obvious with the
changes you made.

-- Steve

> -	preempt_disable_notrace();
> 
>  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
>  				      stack_node);
> diff --git a/include/linux/trace_recursion.h b/include/linux/trace_recursion.h
> index a9f9c57..805f9c4 100644
> --- a/include/linux/trace_recursion.h
> +++ b/include/linux/trace_recursion.h
> @@ -214,7 +214,14 @@ static __always_inline void trace_clear_recursion(int bit)
>  static __always_inline int ftrace_test_recursion_trylock(unsigned long ip,
>  							  unsigned long parent_ip)
>  {
> -	return trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX);
> +	int bit;
> +
> +	preempt_disable_notrace();
> +	bit = trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX);
> +	if (bit < 0)
> +		preempt_enable_notrace();
> +
> +	return bit;
>  }
> 
>  /**
> @@ -226,6 +233,7 @@ static __always_inline int ftrace_test_recursion_trylock(unsigned long ip,
>  static __always_inline void ftrace_test_recursion_unlock(int bit)
>  {
>  	trace_clear_recursion(bit);
> +	preempt_enable_notrace();
>  }
> 
>  #endif /* CONFIG_TRACING */
> diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> index e8029ae..6e66ccd 100644
> --- a/kernel/livepatch/patch.c
> +++ b/kernel/livepatch/patch.c
> @@ -52,11 +52,6 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>  	if (WARN_ON_ONCE(bit < 0))
>  		return;
> -	/*
> -	 * A variant of synchronize_rcu() is used to allow patching functions
> -	 * where RCU is not watching, see klp_synchronize_transition().
> -	 */
> -	preempt_disable_notrace();
> 
>  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
>  				      stack_node);
> @@ -120,7 +115,6 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>  		klp_arch_set_pc(fregs, (unsigned long)func->new_func);
> 
>  unlock:
> -	preempt_enable_notrace();
>  	ftrace_test_recursion_unlock(bit);
>  }

I don't like this change much. We have preempt_disable there not because
of ftrace_test_recursion, but because of RCU. ftrace_test_recursion was
added later. Yes, it would work with the change, but it would also hide
things which should not be hidden in my opinion.

Miroslav
On Tue, 12 Oct 2021 14:24:43 +0200 (CEST)
Miroslav Benes <mbenes@suse.cz> wrote:

> > +++ b/kernel/livepatch/patch.c
> > @@ -52,11 +52,6 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> >  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
> >  	if (WARN_ON_ONCE(bit < 0))
> >  		return;
> > -	/*
> > -	 * A variant of synchronize_rcu() is used to allow patching functions
> > -	 * where RCU is not watching, see klp_synchronize_transition().
> > -	 */
> > -	preempt_disable_notrace();
> > 
> >  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> >  				      stack_node);
> > @@ -120,7 +115,6 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> >  		klp_arch_set_pc(fregs, (unsigned long)func->new_func);
> > 
> >  unlock:
> > -	preempt_enable_notrace();
> >  	ftrace_test_recursion_unlock(bit);
> >  }
> 
> I don't like this change much. We have preempt_disable there not because
> of ftrace_test_recursion, but because of RCU. ftrace_test_recursion was
> added later. Yes, it would work with the change, but it would also hide
> things which should not be hidden in my opinion.

Agreed, but I believe the change is fine, but requires a nice comment to
explain what you said above.

Thus, before the "ftrace_test_recursion_trylock()" we need:

	/*
	 * The ftrace_test_recursion_trylock() will disable preemption,
	 * which is required for the variant of synchronize_rcu() that is
	 * used to allow patching functions where RCU is not watching.
	 * See klp_synchronize_transition() for more details.
	 */

-- Steve
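For illustration, an abridged sketch of how klp_ftrace_handler() would read with the comment Steve asks for here, assuming the trylock disables preemption internally as in the posted patch. This is not the actual v2 submission; the function body is trimmed to the lines relevant to the discussion:

	static void notrace klp_ftrace_handler(unsigned long ip,
					       unsigned long parent_ip,
					       struct ftrace_ops *fops,
					       struct ftrace_regs *fregs)
	{
		struct klp_ops *ops;
		struct klp_func *func;
		int bit;

		ops = container_of(fops, struct klp_ops, fops);

		/*
		 * The ftrace_test_recursion_trylock() will disable preemption,
		 * which is required for the variant of synchronize_rcu() that is
		 * used to allow patching functions where RCU is not watching.
		 * See klp_synchronize_transition() for more details.
		 */
		bit = ftrace_test_recursion_trylock(ip, parent_ip);
		if (WARN_ON_ONCE(bit < 0))
			return;

		func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
					      stack_node);

		/* ... existing patching logic elided ... */

		ftrace_test_recursion_unlock(bit);
	}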
On Tue, 12 Oct 2021 13:40:08 +0800
王贇 <yun.wang@linux.alibaba.com> wrote:

> --- a/include/linux/trace_recursion.h
> +++ b/include/linux/trace_recursion.h
> @@ -214,7 +214,14 @@ static __always_inline void trace_clear_recursion(int bit)
>  static __always_inline int ftrace_test_recursion_trylock(unsigned long ip,
>  							  unsigned long parent_ip)
>  {
> -	return trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX);
> +	int bit;
> +
> +	preempt_disable_notrace();

The recursion test does not require preemption disabled, it uses the task
struct, not per_cpu variables, so you should not disable it before the test.

	bit = trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX);
	if (bit >= 0)
		preempt_disable_notrace();

And if the bit is zero, it means a recursion check was already done by
another caller (ftrace handler does the check, followed by calling perf),
and you really don't even need to disable preemption in that case.

	if (bit > 0)
		preempt_disable_notrace();

And on the unlock, have:

static __always_inline void ftrace_test_recursion_unlock(int bit)
{
	if (bit)
		preempt_enable_notrace();
	trace_clear_recursion(bit);
}

But maybe that's over optimizing ;-)

-- Steve

> +	bit = trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX);
> +	if (bit < 0)
> +		preempt_enable_notrace();
> +
> +	return bit;
>  }
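Putting Steve's fragments together, a consolidated sketch of the variant he outlines above (this is his suggestion, not the code posted in this series): only pay for preempt_disable_notrace() when this is the outermost recursion lock.

	static __always_inline int ftrace_test_recursion_trylock(unsigned long ip,
								  unsigned long parent_ip)
	{
		int bit;

		bit = trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX);
		/* bit == 0: an outer caller already did the check and already disabled preemption */
		if (bit > 0)
			preempt_disable_notrace();

		return bit;
	}

	static __always_inline void ftrace_test_recursion_unlock(int bit)
	{
		if (bit)
			preempt_enable_notrace();
		trace_clear_recursion(bit);
	}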
On 2021/10/12 8:17 PM, Steven Rostedt wrote:
> On Tue, 12 Oct 2021 13:40:08 +0800
> 王贇 <yun.wang@linux.alibaba.com> wrote:
> 
>> @@ -52,11 +52,6 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  	if (WARN_ON_ONCE(bit < 0))
>>  		return;
>> -	/*
>> -	 * A variant of synchronize_rcu() is used to allow patching functions
>> -	 * where RCU is not watching, see klp_synchronize_transition().
>> -	 */
> 
> I have to take a deeper look at this patch set, but do not remove this
> comment, as it explains the protection here, that is not obvious with the
> changes you made.

Will keep that in v2.

Regards,
Michael Wang

> 
> -- Steve
> 
> 
>> -	preempt_disable_notrace();
>> 
>>  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
>>  				      stack_node);
On 2021/10/12 8:24 PM, Miroslav Benes wrote:
[snip]
>> 
>>  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
>>  				      stack_node);
>> @@ -120,7 +115,6 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>>  		klp_arch_set_pc(fregs, (unsigned long)func->new_func);
>> 
>>  unlock:
>> -	preempt_enable_notrace();
>>  	ftrace_test_recursion_unlock(bit);
>>  }
> 
> I don't like this change much. We have preempt_disable there not because
> of ftrace_test_recursion, but because of RCU. ftrace_test_recursion was
> added later. Yes, it would work with the change, but it would also hide
> things which should not be hidden in my opinion.

Not very sure about the background story, but just found this in
'Documentation/trace/ftrace-uses.rst':

  Note, on success,
  ftrace_test_recursion_trylock() will disable preemption, and the
  ftrace_test_recursion_unlock() will enable it again (if it was previously
  enabled).

Seems like this lock pair was supposed to take care of the preemption itself?

Regards,
Michael Wang

> 
> Miroslav
> 
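For reference, a minimal sketch of the usage pattern that documentation excerpt describes: a callback takes the recursion lock, relies on it to keep preemption disabled while touching per-CPU state, then releases it. The callback name and per-CPU counter below are made up for illustration and are not part of the patch:

	static DEFINE_PER_CPU(unsigned long, my_hit_count);	/* hypothetical counter */

	static void notrace my_tracer_func(unsigned long ip, unsigned long parent_ip,
					   struct ftrace_ops *ops,
					   struct ftrace_regs *fregs)
	{
		int bit;

		bit = ftrace_test_recursion_trylock(ip, parent_ip);
		if (bit < 0)
			return;		/* recursion detected, bail out */

		/*
		 * Per the documented contract, preemption is now disabled, so
		 * smp_processor_id() and per-CPU data are safe to use here.
		 */
		this_cpu_inc(my_hit_count);

		ftrace_test_recursion_unlock(bit);
	}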
On 2021/10/12 8:29 PM, Steven Rostedt wrote:
> On Tue, 12 Oct 2021 14:24:43 +0200 (CEST)
> Miroslav Benes <mbenes@suse.cz> wrote:
> 
>>> +++ b/kernel/livepatch/patch.c
>>> @@ -52,11 +52,6 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>>>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>>  	if (WARN_ON_ONCE(bit < 0))
>>>  		return;
>>> -	/*
>>> -	 * A variant of synchronize_rcu() is used to allow patching functions
>>> -	 * where RCU is not watching, see klp_synchronize_transition().
>>> -	 */
>>> -	preempt_disable_notrace();
>>> 
>>>  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
>>>  				      stack_node);
>>> @@ -120,7 +115,6 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>>>  		klp_arch_set_pc(fregs, (unsigned long)func->new_func);
>>> 
>>> unlock:
>>> -	preempt_enable_notrace();
>>>  	ftrace_test_recursion_unlock(bit);
>>>  }
>> 
>> I don't like this change much. We have preempt_disable there not because
>> of ftrace_test_recursion, but because of RCU. ftrace_test_recursion was
>> added later. Yes, it would work with the change, but it would also hide
>> things which should not be hidden in my opinion.
> 
> Agreed, but I believe the change is fine, but requires a nice comment to
> explain what you said above.
> 
> Thus, before the "ftrace_test_recursion_trylock()" we need:
> 
> 	/*
> 	 * The ftrace_test_recursion_trylock() will disable preemption,
> 	 * which is required for the variant of synchronize_rcu() that is
> 	 * used to allow patching functions where RCU is not watching.
> 	 * See klp_synchronize_transition() for more details.
> 	 */

Will be in v2 too :-)

Regards,
Michael Wang

> 
> -- Steve
> 
On 2021/10/12 8:43 PM, Steven Rostedt wrote:
> On Tue, 12 Oct 2021 13:40:08 +0800
> 王贇 <yun.wang@linux.alibaba.com> wrote:
> 
>> --- a/include/linux/trace_recursion.h
>> +++ b/include/linux/trace_recursion.h
>> @@ -214,7 +214,14 @@ static __always_inline void trace_clear_recursion(int bit)
>>  static __always_inline int ftrace_test_recursion_trylock(unsigned long ip,
>>  							  unsigned long parent_ip)
>>  {
>> -	return trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX);
>> +	int bit;
>> +
>> +	preempt_disable_notrace();
> 
> The recursion test does not require preemption disabled, it uses the task
> struct, not per_cpu variables, so you should not disable it before the test.
> 
> 	bit = trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX);
> 	if (bit >= 0)
> 		preempt_disable_notrace();
> 
> And if the bit is zero, it means a recursion check was already done by
> another caller (ftrace handler does the check, followed by calling perf),
> and you really don't even need to disable preemption in that case.
> 
> 	if (bit > 0)
> 		preempt_disable_notrace();
> 
> And on the unlock, have:
> 
> static __always_inline void ftrace_test_recursion_unlock(int bit)
> {
> 	if (bit)
> 		preempt_enable_notrace();
> 	trace_clear_recursion(bit);
> }
> 
> But maybe that's over optimizing ;-)

I see, while the user can still check smp_processor_id() after trylock
returns bit 0...

I guess Peter's point at the very beginning was to prevent such cases, since
a production kernel will not have preemption debugging on, so such an issue
won't get reported but could cause trouble that is really hard to trace down.
A way to eliminate such issues once and for all sounds attractive, doesn't it?

Regards,
Michael Wang

> 
> -- Steve
> 
> 
>> +	bit = trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX);
>> +	if (bit < 0)
>> +		preempt_enable_notrace();
>> +
>> +	return bit;
>>  }
On Wed, 13 Oct 2021 09:50:17 +0800
王贇 <yun.wang@linux.alibaba.com> wrote:

> >> -	preempt_enable_notrace();
> >>  	ftrace_test_recursion_unlock(bit);
> >>  }
> > 
> > I don't like this change much. We have preempt_disable there not because
> > of ftrace_test_recursion, but because of RCU. ftrace_test_recursion was
> > added later. Yes, it would work with the change, but it would also hide
> > things which should not be hidden in my opinion.
> 
> Not very sure about the background story, but just found this in
> 'Documentation/trace/ftrace-uses.rst':
> 
>   Note, on success,
>   ftrace_test_recursion_trylock() will disable preemption, and the
>   ftrace_test_recursion_unlock() will enable it again (if it was previously
>   enabled).

Right, that part is to be fixed by what you are adding here.

The point that Miroslav is complaining about is that the preemption
disabling is special in this case, and not just from the recursion
point of view, which is why the comment is still required.

-- Steve

> 
> Seems like this lock pair was supposed to take care of the preemption itself?
On Wed, 13 Oct 2021 10:04:52 +0800
王贇 <yun.wang@linux.alibaba.com> wrote:

> I see, while the user can still check smp_processor_id() after trylock
> returns bit 0...

But preemption would have already been disabled. That's because bit 0
means that a recursion check has already been made by a previous
caller and this one is nested, thus preemption is already disabled.
If bit is 0, then preemption had better be disabled as well.

-- Steve
On 2021/10/13 10:27 AM, Steven Rostedt wrote:
> On Wed, 13 Oct 2021 09:50:17 +0800
> 王贇 <yun.wang@linux.alibaba.com> wrote:
> 
>>>> -	preempt_enable_notrace();
>>>>  	ftrace_test_recursion_unlock(bit);
>>>>  }
>>> 
>>> I don't like this change much. We have preempt_disable there not because
>>> of ftrace_test_recursion, but because of RCU. ftrace_test_recursion was
>>> added later. Yes, it would work with the change, but it would also hide
>>> things which should not be hidden in my opinion.
>> 
>> Not very sure about the background story, but just found this in
>> 'Documentation/trace/ftrace-uses.rst':
>> 
>>   Note, on success,
>>   ftrace_test_recursion_trylock() will disable preemption, and the
>>   ftrace_test_recursion_unlock() will enable it again (if it was previously
>>   enabled).
> 
> Right, that part is to be fixed by what you are adding here.
> 
> The point that Miroslav is complaining about is that the preemption
> disabling is special in this case, and not just from the recursion
> point of view, which is why the comment is still required.

My bad... the title is confusing people, I will rewrite it.

Regards,
Michael Wang

> 
> -- Steve
> 
> 
>> 
>> Seems like this lock pair was supposed to take care of the preemption itself?
On 2021/10/13 10:30 AM, Steven Rostedt wrote:
> On Wed, 13 Oct 2021 10:04:52 +0800
> 王贇 <yun.wang@linux.alibaba.com> wrote:
> 
>> I see, while the user can still check smp_processor_id() after trylock
>> returns bit 0...
> 
> But preemption would have already been disabled. That's because bit 0
> means that a recursion check has already been made by a previous
> caller and this one is nested, thus preemption is already disabled.
> If bit is 0, then preemption had better be disabled as well.

Thanks for the explanation, now I get your point :-)

Let's make bit 0 an exemption then.

Regards,
Michael Wang

> 
> -- Steve
> 
diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
index ef2bb9b..dff7921 100644
--- a/arch/csky/kernel/probes/ftrace.c
+++ b/arch/csky/kernel/probes/ftrace.c
@@ -24,7 +24,6 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 		return;
 
 	regs = ftrace_get_regs(fregs);
-	preempt_disable_notrace();
 	p = get_kprobe((kprobe_opcode_t *)ip);
 	if (!p) {
 		p = get_kprobe((kprobe_opcode_t *)(ip - MCOUNT_INSN_SIZE));
@@ -64,7 +63,6 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 		__this_cpu_write(current_kprobe, NULL);
 	}
 out:
-	preempt_enable_notrace();
 	ftrace_test_recursion_unlock(bit);
 }
 NOKPROBE_SYMBOL(kprobe_ftrace_handler);
diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index 0a1e75a..3543496 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -216,7 +216,6 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 		return;
 
 	regs = ftrace_get_regs(fregs);
-	preempt_disable_notrace();
 	p = get_kprobe((kprobe_opcode_t *)ip);
 	if (unlikely(!p) || kprobe_disabled(p))
 		goto out;
@@ -245,7 +244,6 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	}
 	__this_cpu_write(current_kprobe, NULL);
 out:
-	preempt_enable_notrace();
 	ftrace_test_recursion_unlock(bit);
 }
 NOKPROBE_SYMBOL(kprobe_ftrace_handler);
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
index 7154d58..072ebe7 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -26,7 +26,6 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
 		return;
 
 	regs = ftrace_get_regs(fregs);
-	preempt_disable_notrace();
 	p = get_kprobe((kprobe_opcode_t *)nip);
 	if (unlikely(!p) || kprobe_disabled(p))
 		goto out;
@@ -61,7 +60,6 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
 		__this_cpu_write(current_kprobe, NULL);
 	}
 out:
-	preempt_enable_notrace();
 	ftrace_test_recursion_unlock(bit);
 }
 NOKPROBE_SYMBOL(kprobe_ftrace_handler);
diff --git a/arch/riscv/kernel/probes/ftrace.c b/arch/riscv/kernel/probes/ftrace.c
index aab85a8..7142ec4 100644
--- a/arch/riscv/kernel/probes/ftrace.c
+++ b/arch/riscv/kernel/probes/ftrace.c
@@ -15,7 +15,6 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	if (bit < 0)
 		return;
 
-	preempt_disable_notrace();
 	p = get_kprobe((kprobe_opcode_t *)ip);
 	if (unlikely(!p) || kprobe_disabled(p))
 		goto out;
@@ -52,7 +51,6 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 		__this_cpu_write(current_kprobe, NULL);
 	}
 out:
-	preempt_enable_notrace();
 	ftrace_test_recursion_unlock(bit);
 }
 NOKPROBE_SYMBOL(kprobe_ftrace_handler);
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index 596de2f..dd2ec14 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -25,7 +25,6 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	if (bit < 0)
 		return;
 
-	preempt_disable_notrace();
 	p = get_kprobe((kprobe_opcode_t *)ip);
 	if (unlikely(!p) || kprobe_disabled(p))
 		goto out;
@@ -59,7 +58,6 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 		__this_cpu_write(current_kprobe, NULL);
 	}
 out:
-	preempt_enable_notrace();
 	ftrace_test_recursion_unlock(bit);
 }
 NOKPROBE_SYMBOL(kprobe_ftrace_handler);
diff --git a/include/linux/trace_recursion.h b/include/linux/trace_recursion.h
index a9f9c57..805f9c4 100644
--- a/include/linux/trace_recursion.h
+++ b/include/linux/trace_recursion.h
@@ -214,7 +214,14 @@ static __always_inline void trace_clear_recursion(int bit)
 static __always_inline int ftrace_test_recursion_trylock(unsigned long ip,
							  unsigned long parent_ip)
 {
-	return trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX);
+	int bit;
+
+	preempt_disable_notrace();
+	bit = trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX);
+	if (bit < 0)
+		preempt_enable_notrace();
+
+	return bit;
 }
 
 /**
@@ -226,6 +233,7 @@ static __always_inline int ftrace_test_recursion_trylock(unsigned long ip,
 static __always_inline void ftrace_test_recursion_unlock(int bit)
 {
 	trace_clear_recursion(bit);
+	preempt_enable_notrace();
 }
 
 #endif /* CONFIG_TRACING */
diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
index e8029ae..6e66ccd 100644
--- a/kernel/livepatch/patch.c
+++ b/kernel/livepatch/patch.c
@@ -52,11 +52,6 @@ static void notrace klp_ftrace_handler(unsigned long ip,
 	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (WARN_ON_ONCE(bit < 0))
 		return;
-	/*
-	 * A variant of synchronize_rcu() is used to allow patching functions
-	 * where RCU is not watching, see klp_synchronize_transition().
-	 */
-	preempt_disable_notrace();
 
 	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
 				      stack_node);
@@ -120,7 +115,6 @@ static void notrace klp_ftrace_handler(unsigned long ip,
 		klp_arch_set_pc(fregs, (unsigned long)func->new_func);
 
 unlock:
-	preempt_enable_notrace();
 	ftrace_test_recursion_unlock(bit);
 }
diff --git a/kernel/trace/trace_functions.c b/kernel/trace/trace_functions.c
index 1f0e63f..9f1bfbe 100644
--- a/kernel/trace/trace_functions.c
+++ b/kernel/trace/trace_functions.c
@@ -186,7 +186,6 @@ static void function_trace_start(struct trace_array *tr)
 		return;
 
 	trace_ctx = tracing_gen_ctx();
-	preempt_disable_notrace();
 
 	cpu = smp_processor_id();
 	data = per_cpu_ptr(tr->array_buffer.data, cpu);
@@ -194,7 +193,6 @@ static void function_trace_start(struct trace_array *tr)
 		trace_function(tr, ip, parent_ip, trace_ctx);
 
 	ftrace_test_recursion_unlock(bit);
-	preempt_enable_notrace();
 }
 
 #ifdef CONFIG_UNWINDER_ORC
@@ -298,8 +296,6 @@ static inline void process_repeats(struct trace_array *tr,
 	if (bit < 0)
 		return;
 
-	preempt_disable_notrace();
-
 	cpu = smp_processor_id();
 	data = per_cpu_ptr(tr->array_buffer.data, cpu);
 	if (atomic_read(&data->disabled))
@@ -324,7 +320,6 @@ static inline void process_repeats(struct trace_array *tr,
 
 out:
 	ftrace_test_recursion_unlock(bit);
-	preempt_enable_notrace();
 }
 
 static void
As the documentation explained, ftrace_test_recursion_trylock() and
ftrace_test_recursion_unlock() were supposed to disable and enable
preemption properly, however currently this work is done outside of the
functions, which could be missed by mistake.

This patch will make sure preemption is disabled when trylock() succeeds,
and unlock() will enable preemption again when the testing of recursion
is finished.

Reported-by: Abaci <abaci@linux.alibaba.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
---
 arch/csky/kernel/probes/ftrace.c     |  2 --
 arch/parisc/kernel/ftrace.c          |  2 --
 arch/powerpc/kernel/kprobes-ftrace.c |  2 --
 arch/riscv/kernel/probes/ftrace.c    |  2 --
 arch/x86/kernel/kprobes/ftrace.c     |  2 --
 include/linux/trace_recursion.h      | 10 +++++++++-
 kernel/livepatch/patch.c             |  6 ------
 kernel/trace/trace_functions.c       |  5 -----
 8 files changed, 9 insertions(+), 22 deletions(-)
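With the series applied, callers drop their explicit preempt_disable_notrace()/preempt_enable_notrace() calls and rely on the lock pair instead. An abridged sketch of how the x86 kprobe handler from the diff above reads after the change (only the lines around the recursion lock are shown; the dispatch body is elided):

	/* Abridged sketch based on arch/x86/kernel/kprobes/ftrace.c after this patch. */
	void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
				   struct ftrace_ops *ops, struct ftrace_regs *fregs)
	{
		struct kprobe *p;
		int bit;

		bit = ftrace_test_recursion_trylock(ip, parent_ip);
		if (bit < 0)
			return;

		/*
		 * No explicit preempt_disable_notrace() here any more: the trylock
		 * above already disabled preemption, so the per-CPU accesses in the
		 * elided body (e.g. __this_cpu_write(current_kprobe, ...)) stay safe.
		 */
		p = get_kprobe((kprobe_opcode_t *)ip);
		if (unlikely(!p) || kprobe_disabled(p))
			goto out;

		/* ... pre/post handler dispatch elided ... */
	out:
		ftrace_test_recursion_unlock(bit);
	}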