[bpf,v2,0/2] rethook: Reject getting a rethook if RCU is not watching

Message ID 165461825202.280167.12903689442217921817.stgit@devnote2 (mailing list archive)

Message

Masami Hiramatsu (Google) June 7, 2022, 4:10 p.m. UTC
Hi,

Here is the 2nd version of the patches to reject rethook if RCU is
not watching. The 1st version is here:

https://lore.kernel.org/all/165189881197.175864.14757002789194211860.stgit@devnote2/

This is actually related to the idle function tracing issue
reported by Jiri on LKML (*)

(*) https://lore.kernel.org/bpf/20220515203653.4039075-1-jolsa@kernel.org/

Jiri reported that the fprobe (and rethook) based kprobe-multi bpf
trace triggers a "suspicious RCU usage" warning. This is because an
RCU operation is used in the kprobe-multi handler. However, I
also found that a similar issue exists in rethook itself, because
rethook also uses RCU operations.

I added a new patch [1/2] to reproduce this issue with fprobe_example.ko.
(With this patch, the sample can avoid using printk(), which also
involves RCU operations.)

 ------
 # insmod fprobe_example.ko symbol=arch_cpu_idle use_trace=1 stackdump=0 
 fprobe_init: Planted fprobe at arch_cpu_idle
 # rmmod fprobe_example.ko 
 
 =============================
 WARNING: suspicious RCU usage
 5.18.0-rc5-00019-gcae4ec21e87a-dirty #30 Not tainted
 -----------------------------
 include/trace/events/lock.h:37 suspicious rcu_dereference_check() usage!
 
 other info that might help us debug this:
 
 rcu_scheduler_active = 2, debug_locks = 1
 
 
 RCU used illegally from extended quiescent state!
 no locks held by swapper/0/0.
 
 stack backtrace:
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.18.0-rc5-00019-gcae4ec21e87a-dirty #30
 ------
 
After applying the fix in [2/2] (which avoids initializing a rethook on
function entry if !rcu_is_watching()), this warning is gone.

 ------
 # insmod fprobe_example.ko symbol=arch_cpu_idle use_trace=1 stackdump=0
 fprobe_init: Planted fprobe at arch_cpu_idle
 # rmmod fprobe_example.ko 
 fprobe_exit: fprobe at arch_cpu_idle unregistered. 225 times hit, 230 times missed
 ------
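
A minimal user-space sketch of the behaviour described above, for readers
without a test kernel at hand. It is not the code from [2/2]: every name
prefixed with mock_ is hypothetical, the RCU state is a plain flag, and the
hit/missed accounting is simplified. It only illustrates the idea that no
rethook node is handed out while RCU is not watching, so the probe is
counted as missed instead of touching RCU.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's RCU state; not a kernel API. */
static bool mock_rcu_watching = true;

/* Hypothetical, heavily simplified rethook node. */
struct mock_rethook_node {
	bool in_use;
};

/* Hypothetical probe with counters like those printed by the sample. */
struct mock_fprobe {
	unsigned long nhit;
	unsigned long nmissed;
	struct mock_rethook_node node;
};

/*
 * Models the guard: refuse to hand out a node when RCU is not
 * watching, because releasing the node later would need RCU.
 */
static struct mock_rethook_node *mock_rethook_try_get(struct mock_fprobe *fp)
{
	if (!mock_rcu_watching)
		return NULL;
	if (fp->node.in_use)
		return NULL;
	fp->node.in_use = true;
	return &fp->node;
}

/* Models giving the node back once the probed function has returned. */
static void mock_rethook_recycle(struct mock_rethook_node *node)
{
	node->in_use = false;
}

/* Models one entry into the probed function: a hit or a miss. */
static void mock_fprobe_entry(struct mock_fprobe *fp)
{
	struct mock_rethook_node *node = mock_rethook_try_get(fp);

	if (!node) {
		fp->nmissed++;
		return;
	}
	fp->nhit++;
	mock_rethook_recycle(node);
}
```

Calling mock_fprobe_entry() once with mock_rcu_watching set and twice with
it cleared leaves nhit at 1 and nmissed at 2, which mirrors the
"hit/missed" line in the output above.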

Note that this test only works until arch_cpu_idle() is marked
as noinstr. After that, the function cannot be traced.

Thank you,

---

Masami Hiramatsu (Google) (2):
      fprobe: samples: Add use_trace option and show hit/missed counter
      rethook: Reject getting a rethook if RCU is not watching


 kernel/trace/rethook.c          |    9 +++++++++
 samples/fprobe/fprobe_example.c |   21 +++++++++++++++++----
 2 files changed, 26 insertions(+), 4 deletions(-)


Comments

Jiri Olsa June 17, 2022, 11:27 a.m. UTC | #1
On Wed, Jun 08, 2022 at 01:10:52AM +0900, Masami Hiramatsu (Google) wrote:
> Masami Hiramatsu (Google) (2):
>       fprobe: samples: Add use_trace option and show hit/missed counter
>       rethook: Reject getting a rethook if RCU is not watching

LGTM

Acked-by: Jiri Olsa <jolsa@kernel.org>

jirka
