diff mbox series

[bpf-next,1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr

Message ID 20220515203653.4039075-1-jolsa@kernel.org (mailing list archive)
State Changes Requested
Delegated to: BPF
Headers show
Series [bpf-next,1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 24 this patch: 24
netdev/cc_maintainers warning 16 maintainers not CCed: rcu@vger.kernel.org peterz@infradead.org josh@joshtriplett.org bp@alien8.de hpa@zytor.com joel@joelfernandes.org jiangshanlai@gmail.com kpsingh@kernel.org quic_neeraju@quicinc.com mingo@redhat.com x86@kernel.org zhengqi.arch@bytedance.com frederic@kernel.org tglx@linutronix.de dave.hansen@linux.intel.com mathieu.desnoyers@efficios.com
netdev/build_clang fail Errors and warnings before: 12 this patch: 14
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn fail Errors and warnings before: 25 this patch: 27
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 16 lines checked
netdev/kdoc success Errors and warnings before: 1 this patch: 1
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Kernel LATEST on ubuntu-latest with llvm-15
bpf/vmtest-bpf-next-VM_Test-1 success Logs for Kernel LATEST on ubuntu-latest with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Kernel LATEST on z15 with gcc
bpf/vmtest-bpf-next-PR fail PR summary

Commit Message

Jiri Olsa May 15, 2022, 8:36 p.m. UTC
Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
in rcu 'not watching' context and if there's tracer attached to
them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
warning like:

  [    3.017540] WARNING: suspicious RCU usage
  ...
  [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
  [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
  [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
  [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
  [    3.018371]  fprobe_handler.part.0+0xab/0x150
  [    3.018374]  0xffffffffa00080c8
  [    3.018393]  ? arch_cpu_idle+0x5/0x10
  [    3.018398]  arch_cpu_idle+0x5/0x10
  [    3.018399]  default_idle_call+0x59/0x90
  [    3.018401]  do_idle+0x1c3/0x1d0

The call path is following:

default_idle_call
  rcu_idle_enter
  arch_cpu_idle
  rcu_idle_exit

The arch_cpu_idle and rcu_idle_exit are the only ones from above
path that are traceble and cause this problem on my setup.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 arch/x86/kernel/process.c | 2 +-
 kernel/rcu/tree.c         | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Comments

Paul E. McKenney May 16, 2022, 4:25 a.m. UTC | #1
On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> in rcu 'not watching' context and if there's tracer attached to
> them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> warning like:
> 
>   [    3.017540] WARNING: suspicious RCU usage
>   ...
>   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
>   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
>   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
>   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
>   [    3.018371]  fprobe_handler.part.0+0xab/0x150
>   [    3.018374]  0xffffffffa00080c8
>   [    3.018393]  ? arch_cpu_idle+0x5/0x10
>   [    3.018398]  arch_cpu_idle+0x5/0x10
>   [    3.018399]  default_idle_call+0x59/0x90
>   [    3.018401]  do_idle+0x1c3/0x1d0
> 
> The call path is following:
> 
> default_idle_call
>   rcu_idle_enter
>   arch_cpu_idle
>   rcu_idle_exit
> 
> The arch_cpu_idle and rcu_idle_exit are the only ones from above
> path that are traceble and cause this problem on my setup.
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>

From an RCU viewpoint:

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

[ I considered asking for an instrumentation_on() in rcu_idle_exit(),
but there is no point given that local_irq_restore() isn't something
you instrument anyway. ]

> ---
>  arch/x86/kernel/process.c | 2 +-
>  kernel/rcu/tree.c         | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index b370767f5b19..1345cb0124a6 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -720,7 +720,7 @@ void arch_cpu_idle_dead(void)
>  /*
>   * Called from the generic idle code.
>   */
> -void arch_cpu_idle(void)
> +void noinstr arch_cpu_idle(void)
>  {
>  	x86_idle();
>  }
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index a4b8189455d5..20d529722f51 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -896,7 +896,7 @@ static void noinstr rcu_eqs_exit(bool user)
>   * If you add or remove a call to rcu_idle_exit(), be sure to test with
>   * CONFIG_RCU_EQS_DEBUG=y.
>   */
> -void rcu_idle_exit(void)
> +void noinstr rcu_idle_exit(void)
>  {
>  	unsigned long flags;
>  
> -- 
> 2.35.3
>
Frederic Weisbecker May 16, 2022, 11:49 a.m. UTC | #2
On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > in rcu 'not watching' context and if there's tracer attached to
> > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > warning like:
> > 
> >   [    3.017540] WARNING: suspicious RCU usage
> >   ...
> >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> >   [    3.018374]  0xffffffffa00080c8
> >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> >   [    3.018398]  arch_cpu_idle+0x5/0x10
> >   [    3.018399]  default_idle_call+0x59/0x90
> >   [    3.018401]  do_idle+0x1c3/0x1d0
> > 
> > The call path is following:
> > 
> > default_idle_call
> >   rcu_idle_enter
> >   arch_cpu_idle
> >   rcu_idle_exit
> > 
> > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > path that are traceble and cause this problem on my setup.
> > 
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> 
> From an RCU viewpoint:
> 
> Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> 
> [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> but there is no point given that local_irq_restore() isn't something
> you instrument anyway. ]

So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
it is instrumentable by the function (graph)  tracers and the irqsoff tracer.

Also it calls into lockdep that might make use of RCU.

That's why rcu_idle_exit() is not noinstr yet. See this patch:

https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/

Thanks.

> 
> > ---
> >  arch/x86/kernel/process.c | 2 +-
> >  kernel/rcu/tree.c         | 2 +-
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> > index b370767f5b19..1345cb0124a6 100644
> > --- a/arch/x86/kernel/process.c
> > +++ b/arch/x86/kernel/process.c
> > @@ -720,7 +720,7 @@ void arch_cpu_idle_dead(void)
> >  /*
> >   * Called from the generic idle code.
> >   */
> > -void arch_cpu_idle(void)
> > +void noinstr arch_cpu_idle(void)
> >  {
> >  	x86_idle();
> >  }
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index a4b8189455d5..20d529722f51 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -896,7 +896,7 @@ static void noinstr rcu_eqs_exit(bool user)
> >   * If you add or remove a call to rcu_idle_exit(), be sure to test with
> >   * CONFIG_RCU_EQS_DEBUG=y.
> >   */
> > -void rcu_idle_exit(void)
> > +void noinstr rcu_idle_exit(void)
> >  {
> >  	unsigned long flags;
> >  
> > -- 
> > 2.35.3
> >
Paul E. McKenney May 16, 2022, 1:39 p.m. UTC | #3
On Mon, May 16, 2022 at 01:49:22PM +0200, Frederic Weisbecker wrote:
> On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> > On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > in rcu 'not watching' context and if there's tracer attached to
> > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > warning like:
> > > 
> > >   [    3.017540] WARNING: suspicious RCU usage
> > >   ...
> > >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > >   [    3.018374]  0xffffffffa00080c8
> > >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > >   [    3.018398]  arch_cpu_idle+0x5/0x10
> > >   [    3.018399]  default_idle_call+0x59/0x90
> > >   [    3.018401]  do_idle+0x1c3/0x1d0
> > > 
> > > The call path is following:
> > > 
> > > default_idle_call
> > >   rcu_idle_enter
> > >   arch_cpu_idle
> > >   rcu_idle_exit
> > > 
> > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > path that are traceble and cause this problem on my setup.
> > > 
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > 
> > From an RCU viewpoint:
> > 
> > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > 
> > [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> > but there is no point given that local_irq_restore() isn't something
> > you instrument anyway. ]
> 
> So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
> it is instrumentable by the function (graph)  tracers and the irqsoff tracer.
> 
> Also it calls into lockdep that might make use of RCU.
> 
> That's why rcu_idle_exit() is not noinstr yet. See this patch:
> 
> https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/

Ah, I should have looked at the context-tracking series again!

And I have to ask...  How much debugging capability are we really losing
by continuing to use the raw versions of local_irq_{save,restore}()?

							Thanx, Paul

> Thanks.
> 
> > 
> > > ---
> > >  arch/x86/kernel/process.c | 2 +-
> > >  kernel/rcu/tree.c         | 2 +-
> > >  2 files changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> > > index b370767f5b19..1345cb0124a6 100644
> > > --- a/arch/x86/kernel/process.c
> > > +++ b/arch/x86/kernel/process.c
> > > @@ -720,7 +720,7 @@ void arch_cpu_idle_dead(void)
> > >  /*
> > >   * Called from the generic idle code.
> > >   */
> > > -void arch_cpu_idle(void)
> > > +void noinstr arch_cpu_idle(void)
> > >  {
> > >  	x86_idle();
> > >  }
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index a4b8189455d5..20d529722f51 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -896,7 +896,7 @@ static void noinstr rcu_eqs_exit(bool user)
> > >   * If you add or remove a call to rcu_idle_exit(), be sure to test with
> > >   * CONFIG_RCU_EQS_DEBUG=y.
> > >   */
> > > -void rcu_idle_exit(void)
> > > +void noinstr rcu_idle_exit(void)
> > >  {
> > >  	unsigned long flags;
> > >  
> > > -- 
> > > 2.35.3
> > >
kernel test robot May 17, 2022, 2:39 a.m. UTC | #4
Hi Jiri,

I love your patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Jiri-Olsa/cpuidle-rcu-Making-arch_cpu_idle-and-rcu_idle_exit-noinstr/20220516-043752
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: x86_64-randconfig-c002-20220516 (https://download.01.org/0day-ci/archive/20220517/202205171050.9msZot6F-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.2.0-20) 11.2.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/0b6fee32d730f621f2bfc4d8d9f0729814398415
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Jiri-Olsa/cpuidle-rcu-Making-arch_cpu_idle-and-rcu_idle_exit-noinstr/20220516-043752
        git checkout 0b6fee32d730f621f2bfc4d8d9f0729814398415
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> vmlinux.o: warning: objtool: arch_cpu_idle()+0x0: call to {dynamic}() leaves .noinstr.text section
>> vmlinux.o: warning: objtool: rcu_idle_exit()+0x16: call to trace_hardirqs_off() leaves .noinstr.text section
   vmlinux.o: warning: objtool: enter_from_user_mode()+0x18: call to __kcsan_check_access() leaves .noinstr.text section
   vmlinux.o: warning: objtool: syscall_enter_from_user_mode()+0x1d: call to __kcsan_check_access() leaves .noinstr.text section
   vmlinux.o: warning: objtool: syscall_enter_from_user_mode_prepare()+0x18: call to __kcsan_check_access() leaves .noinstr.text section
   vmlinux.o: warning: objtool: irqentry_enter_from_user_mode()+0x18: call to __kcsan_check_access() leaves .noinstr.text section
Yonghong Song May 17, 2022, 2:54 a.m. UTC | #5
On 5/15/22 1:36 PM, Jiri Olsa wrote:
> Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> in rcu 'not watching' context and if there's tracer attached to
> them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> warning like:
> 
>    [    3.017540] WARNING: suspicious RCU usage
>    ...
>    [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
>    [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
>    [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
>    [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
>    [    3.018371]  fprobe_handler.part.0+0xab/0x150
>    [    3.018374]  0xffffffffa00080c8
>    [    3.018393]  ? arch_cpu_idle+0x5/0x10
>    [    3.018398]  arch_cpu_idle+0x5/0x10
>    [    3.018399]  default_idle_call+0x59/0x90
>    [    3.018401]  do_idle+0x1c3/0x1d0
> 
> The call path is following:
> 
> default_idle_call
>    rcu_idle_enter
>    arch_cpu_idle
>    rcu_idle_exit
> 
> The arch_cpu_idle and rcu_idle_exit are the only ones from above
> path that are traceble and cause this problem on my setup.
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>   arch/x86/kernel/process.c | 2 +-
>   kernel/rcu/tree.c         | 2 +-
>   2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index b370767f5b19..1345cb0124a6 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -720,7 +720,7 @@ void arch_cpu_idle_dead(void)
>   /*
>    * Called from the generic idle code.
>    */
> -void arch_cpu_idle(void)
> +void noinstr arch_cpu_idle(void)

noinstr includes a lot of attributes:

#define noinstr                                                         \
         noinline notrace __attribute((__section__(".noinstr.text")))    \
         __no_kcsan __no_sanitize_address __no_profile 
__no_sanitize_coverage

should we use notrace here?

>   {
>   	x86_idle();
>   }
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index a4b8189455d5..20d529722f51 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -896,7 +896,7 @@ static void noinstr rcu_eqs_exit(bool user)
>    * If you add or remove a call to rcu_idle_exit(), be sure to test with
>    * CONFIG_RCU_EQS_DEBUG=y.
>    */
> -void rcu_idle_exit(void)
> +void noinstr rcu_idle_exit(void)
>   {
>   	unsigned long flags;
>
Jiri Olsa May 17, 2022, 7:49 a.m. UTC | #6
On Mon, May 16, 2022 at 07:54:37PM -0700, Yonghong Song wrote:
> 
> 
> On 5/15/22 1:36 PM, Jiri Olsa wrote:
> > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > in rcu 'not watching' context and if there's tracer attached to
> > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > warning like:
> > 
> >    [    3.017540] WARNING: suspicious RCU usage
> >    ...
> >    [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> >    [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> >    [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> >    [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> >    [    3.018371]  fprobe_handler.part.0+0xab/0x150
> >    [    3.018374]  0xffffffffa00080c8
> >    [    3.018393]  ? arch_cpu_idle+0x5/0x10
> >    [    3.018398]  arch_cpu_idle+0x5/0x10
> >    [    3.018399]  default_idle_call+0x59/0x90
> >    [    3.018401]  do_idle+0x1c3/0x1d0
> > 
> > The call path is following:
> > 
> > default_idle_call
> >    rcu_idle_enter
> >    arch_cpu_idle
> >    rcu_idle_exit
> > 
> > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > path that are traceble and cause this problem on my setup.
> > 
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >   arch/x86/kernel/process.c | 2 +-
> >   kernel/rcu/tree.c         | 2 +-
> >   2 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> > index b370767f5b19..1345cb0124a6 100644
> > --- a/arch/x86/kernel/process.c
> > +++ b/arch/x86/kernel/process.c
> > @@ -720,7 +720,7 @@ void arch_cpu_idle_dead(void)
> >   /*
> >    * Called from the generic idle code.
> >    */
> > -void arch_cpu_idle(void)
> > +void noinstr arch_cpu_idle(void)
> 
> noinstr includes a lot of attributes:
> 
> #define noinstr                                                         \
>         noinline notrace __attribute((__section__(".noinstr.text")))    \
>         __no_kcsan __no_sanitize_address __no_profile __no_sanitize_coverage
> 
> should we use notrace here?

hm right, so notrace should be enough for our case (kprobe_multi)
which is based on ftrace/fprobe jump

noinstr (among other things) adds the function also the kprobes
blacklist, which will prevent standard kprobes to attach

ASAICS standard kprobes use rcu in probe path as well, like in
opt_pre_handler function

so I think we should go with noinstr

jirka

> 
> >   {
> >   	x86_idle();
> >   }
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index a4b8189455d5..20d529722f51 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -896,7 +896,7 @@ static void noinstr rcu_eqs_exit(bool user)
> >    * If you add or remove a call to rcu_idle_exit(), be sure to test with
> >    * CONFIG_RCU_EQS_DEBUG=y.
> >    */
> > -void rcu_idle_exit(void)
> > +void noinstr rcu_idle_exit(void)
> >   {
> >   	unsigned long flags;
kernel test robot May 17, 2022, 9:19 a.m. UTC | #7
Hi Jiri,

I love your patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Jiri-Olsa/cpuidle-rcu-Making-arch_cpu_idle-and-rcu_idle_exit-noinstr/20220516-043752
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: x86_64-randconfig-a014-20220516 (https://download.01.org/0day-ci/archive/20220517/202205171711.hqxFhp5l-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.2.0-20) 11.2.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/0b6fee32d730f621f2bfc4d8d9f0729814398415
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Jiri-Olsa/cpuidle-rcu-Making-arch_cpu_idle-and-rcu_idle_exit-noinstr/20220516-043752
        git checkout 0b6fee32d730f621f2bfc4d8d9f0729814398415
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   vmlinux.o: warning: objtool: vmx_l1d_flush()+0x13: call to static_key_count() leaves .noinstr.text section
   vmlinux.o: warning: objtool: vmx_vcpu_enter_exit()+0x29: call to static_key_count() leaves .noinstr.text section
   vmlinux.o: warning: objtool: vmx_update_host_rsp()+0x3e: call to static_key_count() leaves .noinstr.text section
   vmlinux.o: warning: objtool: arch_cpu_idle()+0xb: call to {dynamic}() leaves .noinstr.text section
>> vmlinux.o: warning: objtool: rcu_idle_exit()+0x25: call to trace_hardirqs_on() leaves .noinstr.text section
   vmlinux.o: warning: objtool: enter_from_user_mode()+0x1c: call to __kcsan_check_access() leaves .noinstr.text section
   vmlinux.o: warning: objtool: syscall_enter_from_user_mode()+0x21: call to __kcsan_check_access() leaves .noinstr.text section
   vmlinux.o: warning: objtool: syscall_enter_from_user_mode_prepare()+0x1c: call to __kcsan_check_access() leaves .noinstr.text section
   vmlinux.o: warning: objtool: exit_to_user_mode()+0x1b: call to static_key_count() leaves .noinstr.text section
   vmlinux.o: warning: objtool: syscall_exit_to_user_mode()+0x36: call to static_key_count() leaves .noinstr.text section
   vmlinux.o: warning: objtool: irqentry_enter_from_user_mode()+0x1c: call to __kcsan_check_access() leaves .noinstr.text section
   vmlinux.o: warning: objtool: irqentry_exit_to_user_mode()+0x22: call to static_key_count() leaves .noinstr.text section
Jiri Olsa May 17, 2022, 10:13 a.m. UTC | #8
On Mon, May 16, 2022 at 01:49:22PM +0200, Frederic Weisbecker wrote:
> On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> > On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > in rcu 'not watching' context and if there's tracer attached to
> > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > warning like:
> > > 
> > >   [    3.017540] WARNING: suspicious RCU usage
> > >   ...
> > >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > >   [    3.018374]  0xffffffffa00080c8
> > >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > >   [    3.018398]  arch_cpu_idle+0x5/0x10
> > >   [    3.018399]  default_idle_call+0x59/0x90
> > >   [    3.018401]  do_idle+0x1c3/0x1d0
> > > 
> > > The call path is following:
> > > 
> > > default_idle_call
> > >   rcu_idle_enter
> > >   arch_cpu_idle
> > >   rcu_idle_exit
> > > 
> > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > path that are traceble and cause this problem on my setup.
> > > 
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > 
> > From an RCU viewpoint:
> > 
> > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > 
> > [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> > but there is no point given that local_irq_restore() isn't something
> > you instrument anyway. ]
> 
> So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
> it is instrumentable by the function (graph)  tracers and the irqsoff tracer.
> 
> Also it calls into lockdep that might make use of RCU.
> 
> That's why rcu_idle_exit() is not noinstr yet. See this patch:
> 
> https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/

I see, could we mark it at least with notrace meanwhile?

jirka
Paul E. McKenney May 18, 2022, 4:21 p.m. UTC | #9
On Tue, May 17, 2022 at 12:13:45PM +0200, Jiri Olsa wrote:
> On Mon, May 16, 2022 at 01:49:22PM +0200, Frederic Weisbecker wrote:
> > On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> > > On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > > in rcu 'not watching' context and if there's tracer attached to
> > > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > > warning like:
> > > > 
> > > >   [    3.017540] WARNING: suspicious RCU usage
> > > >   ...
> > > >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > > >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > > >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > > >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > > >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > > >   [    3.018374]  0xffffffffa00080c8
> > > >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > > >   [    3.018398]  arch_cpu_idle+0x5/0x10
> > > >   [    3.018399]  default_idle_call+0x59/0x90
> > > >   [    3.018401]  do_idle+0x1c3/0x1d0
> > > > 
> > > > The call path is following:
> > > > 
> > > > default_idle_call
> > > >   rcu_idle_enter
> > > >   arch_cpu_idle
> > > >   rcu_idle_exit
> > > > 
> > > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > > path that are traceble and cause this problem on my setup.
> > > > 
> > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > 
> > > From an RCU viewpoint:
> > > 
> > > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > > 
> > > [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> > > but there is no point given that local_irq_restore() isn't something
> > > you instrument anyway. ]
> > 
> > So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
> > it is instrumentable by the function (graph)  tracers and the irqsoff tracer.
> > 
> > Also it calls into lockdep that might make use of RCU.
> > 
> > That's why rcu_idle_exit() is not noinstr yet. See this patch:
> > 
> > https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/
> 
> I see, could we mark it at least with notrace meanwhile?

For the RCU part, how about as follows?

If this approach is reasonable, my guess would be that Frederic will pull
it into his context-tracking series, perhaps using a revert of this patch
to maintain sanity in the near term.

If this approach is unreasonable, well, that is Murphy for you!

For the x86 idle part, my feeling is still that the rcu_idle_enter()
and rcu_idle_exit() need to be pushed deeper into the code.  Perhaps
an ongoing process as the idle loop continues to be dug deeper?

							Thanx, Paul

------------------------------------------------------------------------

commit cd338be719a0a692e0d50e1a8438e1f6c7165d9c
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Tue May 17 21:00:04 2022 -0700

    rcu: Apply noinstr to rcu_idle_enter() and rcu_idle_exit()
    
    This commit applies the "noinstr" tag to the rcu_idle_enter() and
    rcu_idle_exit() functions, which are invoked from portions of the idle
    loop that cannot be instrumented.  These tags require reworking the
    rcu_eqs_enter() and rcu_eqs_exit() functions that these two functions
    invoke in order to cause them to use normal assertions rather than
    lockdep.  In addition, within rcu_idle_exit(), the raw versions of
    local_irq_save() and local_irq_restore() are used, again to avoid issues
    with lockdep in uninstrumented code.
    
    This patch is based in part on an earlier patch by Jiri Olsa, discussions
    with Peter Zijlstra and Frederic Weisbecker, earlier changes by Thomas
    Gleixner, and off-list discussions with Yonghong Song.
    
    Link: https://lore.kernel.org/lkml/20220515203653.4039075-1-jolsa@kernel.org/
    Reported-by: Jiri Olsa <jolsa@kernel.org>
    Reported-by: Alexei Starovoitov <ast@kernel.org>
    Reported-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Reviewed-by: Yonghong Song <yhs@fb.com>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 222d59299a2af..02233b17cce0e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -635,8 +635,8 @@ static noinstr void rcu_eqs_enter(bool user)
 		return;
 	}
 
-	lockdep_assert_irqs_disabled();
 	instrumentation_begin();
+	lockdep_assert_irqs_disabled();
 	trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&rdp->dynticks));
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
 	rcu_preempt_deferred_qs(current);
@@ -663,9 +663,9 @@ static noinstr void rcu_eqs_enter(bool user)
  * If you add or remove a call to rcu_idle_enter(), be sure to test with
  * CONFIG_RCU_EQS_DEBUG=y.
  */
-void rcu_idle_enter(void)
+void noinstr rcu_idle_enter(void)
 {
-	lockdep_assert_irqs_disabled();
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
 	rcu_eqs_enter(false);
 }
 EXPORT_SYMBOL_GPL(rcu_idle_enter);
@@ -865,7 +865,7 @@ static void noinstr rcu_eqs_exit(bool user)
 	struct rcu_data *rdp;
 	long oldval;
 
-	lockdep_assert_irqs_disabled();
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
 	rdp = this_cpu_ptr(&rcu_data);
 	oldval = rdp->dynticks_nesting;
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
@@ -900,13 +900,13 @@ static void noinstr rcu_eqs_exit(bool user)
  * If you add or remove a call to rcu_idle_exit(), be sure to test with
  * CONFIG_RCU_EQS_DEBUG=y.
  */
-void rcu_idle_exit(void)
+void noinstr rcu_idle_exit(void)
 {
 	unsigned long flags;
 
-	local_irq_save(flags);
+	raw_local_irq_save(flags);
 	rcu_eqs_exit(false);
-	local_irq_restore(flags);
+	raw_local_irq_restore(flags);
 }
 EXPORT_SYMBOL_GPL(rcu_idle_exit);
Masami Hiramatsu (Google) May 19, 2022, midnight UTC | #10
On Tue, 17 May 2022 09:49:33 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:

> On Mon, May 16, 2022 at 07:54:37PM -0700, Yonghong Song wrote:
> > 
> > 
> > On 5/15/22 1:36 PM, Jiri Olsa wrote:
> > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > in rcu 'not watching' context and if there's tracer attached to
> > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > warning like:
> > > 
> > >    [    3.017540] WARNING: suspicious RCU usage
> > >    ...
> > >    [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > >    [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > >    [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > >    [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > >    [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > >    [    3.018374]  0xffffffffa00080c8
> > >    [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > >    [    3.018398]  arch_cpu_idle+0x5/0x10
> > >    [    3.018399]  default_idle_call+0x59/0x90
> > >    [    3.018401]  do_idle+0x1c3/0x1d0
> > > 
> > > The call path is following:
> > > 
> > > default_idle_call
> > >    rcu_idle_enter
> > >    arch_cpu_idle
> > >    rcu_idle_exit
> > > 
> > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > path that are traceble and cause this problem on my setup.
> > > 
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > ---
> > >   arch/x86/kernel/process.c | 2 +-
> > >   kernel/rcu/tree.c         | 2 +-
> > >   2 files changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> > > index b370767f5b19..1345cb0124a6 100644
> > > --- a/arch/x86/kernel/process.c
> > > +++ b/arch/x86/kernel/process.c
> > > @@ -720,7 +720,7 @@ void arch_cpu_idle_dead(void)
> > >   /*
> > >    * Called from the generic idle code.
> > >    */
> > > -void arch_cpu_idle(void)
> > > +void noinstr arch_cpu_idle(void)
> > 
> > noinstr includes a lot of attributes:
> > 
> > #define noinstr                                                         \
> >         noinline notrace __attribute((__section__(".noinstr.text")))    \
> >         __no_kcsan __no_sanitize_address __no_profile __no_sanitize_coverage
> > 
> > should we use notrace here?
> 
> hm right, so notrace should be enough for our case (kprobe_multi)
> which is based on ftrace/fprobe jump
> 
> noinstr (among other things) adds the function also the kprobes
> blacklist, which will prevent standard kprobes to attach
> 
> ASAICS standard kprobes use rcu in probe path as well, like in
> opt_pre_handler function
> 
> so I think we should go with noinstr

Yes, I agree that noinstr is preferrable for these functions.

Thank you!

> 
> jirka
> 
> > 
> > >   {
> > >   	x86_idle();
> > >   }
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index a4b8189455d5..20d529722f51 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -896,7 +896,7 @@ static void noinstr rcu_eqs_exit(bool user)
> > >    * If you add or remove a call to rcu_idle_exit(), be sure to test with
> > >    * CONFIG_RCU_EQS_DEBUG=y.
> > >    */
> > > -void rcu_idle_exit(void)
> > > +void noinstr rcu_idle_exit(void)
> > >   {
> > >   	unsigned long flags;
Jiri Olsa May 19, 2022, 11:33 a.m. UTC | #11
On Wed, May 18, 2022 at 09:21:18AM -0700, Paul E. McKenney wrote:
> On Tue, May 17, 2022 at 12:13:45PM +0200, Jiri Olsa wrote:
> > On Mon, May 16, 2022 at 01:49:22PM +0200, Frederic Weisbecker wrote:
> > > On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> > > > On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > > > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > > > in rcu 'not watching' context and if there's tracer attached to
> > > > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > > > warning like:
> > > > > 
> > > > >   [    3.017540] WARNING: suspicious RCU usage
> > > > >   ...
> > > > >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > > > >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > > > >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > > > >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > > > >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > > > >   [    3.018374]  0xffffffffa00080c8
> > > > >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > > > >   [    3.018398]  arch_cpu_idle+0x5/0x10
> > > > >   [    3.018399]  default_idle_call+0x59/0x90
> > > > >   [    3.018401]  do_idle+0x1c3/0x1d0
> > > > > 
> > > > > The call path is following:
> > > > > 
> > > > > default_idle_call
> > > > >   rcu_idle_enter
> > > > >   arch_cpu_idle
> > > > >   rcu_idle_exit
> > > > > 
> > > > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > > > path that are traceble and cause this problem on my setup.
> > > > > 
> > > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > 
> > > > From an RCU viewpoint:
> > > > 
> > > > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > > > 
> > > > [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> > > > but there is no point given that local_irq_restore() isn't something
> > > > you instrument anyway. ]
> > > 
> > > So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
> > > it is instrumentable by the function (graph)  tracers and the irqsoff tracer.
> > > 
> > > Also it calls into lockdep that might make use of RCU.
> > > 
> > > That's why rcu_idle_exit() is not noinstr yet. See this patch:
> > > 
> > > https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/
> > 
> > I see, could we mark it at least with notrace meanwhile?
> 
> For the RCU part, how about as follows?
> 
> If this approach is reasonable, my guess would be that Frederic will pull
> it into his context-tracking series, perhaps using a revert of this patch
> to maintain sanity in the near term.
> 
> If this approach is unreasonable, well, that is Murphy for you!

I checked and it works in my test ;-)

> 
> For the x86 idle part, my feeling is still that the rcu_idle_enter()
> and rcu_idle_exit() need to be pushed deeper into the code.  Perhaps
> an ongoing process as the idle loop continues to be dug deeper?

for arch_cpu_idle with noinstr I'm getting this W=1 warning:

vmlinux.o: warning: objtool: arch_cpu_idle()+0xb: call to {dynamic}() leaves .noinstr.text section

we could have it with notrace if that's a problem

thanks,
jirka

> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> commit cd338be719a0a692e0d50e1a8438e1f6c7165d9c
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Tue May 17 21:00:04 2022 -0700
> 
>     rcu: Apply noinstr to rcu_idle_enter() and rcu_idle_exit()
>     
>     This commit applies the "noinstr" tag to the rcu_idle_enter() and
>     rcu_idle_exit() functions, which are invoked from portions of the idle
>     loop that cannot be instrumented.  These tags require reworking the
>     rcu_eqs_enter() and rcu_eqs_exit() functions that these two functions
>     invoke in order to cause them to use normal assertions rather than
>     lockdep.  In addition, within rcu_idle_exit(), the raw versions of
>     local_irq_save() and local_irq_restore() are used, again to avoid issues
>     with lockdep in uninstrumented code.
>     
>     This patch is based in part on an earlier patch by Jiri Olsa, discussions
>     with Peter Zijlstra and Frederic Weisbecker, earlier changes by Thomas
>     Gleixner, and off-list discussions with Yonghong Song.
>     
>     Link: https://lore.kernel.org/lkml/20220515203653.4039075-1-jolsa@kernel.org/
>     Reported-by: Jiri Olsa <jolsa@kernel.org>
>     Reported-by: Alexei Starovoitov <ast@kernel.org>
>     Reported-by: Andrii Nakryiko <andrii@kernel.org>
>     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>     Reviewed-by: Yonghong Song <yhs@fb.com>
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 222d59299a2af..02233b17cce0e 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -635,8 +635,8 @@ static noinstr void rcu_eqs_enter(bool user)
>  		return;
>  	}
>  
> -	lockdep_assert_irqs_disabled();
>  	instrumentation_begin();
> +	lockdep_assert_irqs_disabled();
>  	trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&rdp->dynticks));
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
>  	rcu_preempt_deferred_qs(current);
> @@ -663,9 +663,9 @@ static noinstr void rcu_eqs_enter(bool user)
>   * If you add or remove a call to rcu_idle_enter(), be sure to test with
>   * CONFIG_RCU_EQS_DEBUG=y.
>   */
> -void rcu_idle_enter(void)
> +void noinstr rcu_idle_enter(void)
>  {
> -	lockdep_assert_irqs_disabled();
> +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
>  	rcu_eqs_enter(false);
>  }
>  EXPORT_SYMBOL_GPL(rcu_idle_enter);
> @@ -865,7 +865,7 @@ static void noinstr rcu_eqs_exit(bool user)
>  	struct rcu_data *rdp;
>  	long oldval;
>  
> -	lockdep_assert_irqs_disabled();
> +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
>  	rdp = this_cpu_ptr(&rcu_data);
>  	oldval = rdp->dynticks_nesting;
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
> @@ -900,13 +900,13 @@ static void noinstr rcu_eqs_exit(bool user)
>   * If you add or remove a call to rcu_idle_exit(), be sure to test with
>   * CONFIG_RCU_EQS_DEBUG=y.
>   */
> -void rcu_idle_exit(void)
> +void noinstr rcu_idle_exit(void)
>  {
>  	unsigned long flags;
>  
> -	local_irq_save(flags);
> +	raw_local_irq_save(flags);
>  	rcu_eqs_exit(false);
> -	local_irq_restore(flags);
> +	raw_local_irq_restore(flags);
>  }
>  EXPORT_SYMBOL_GPL(rcu_idle_exit);
>
Paul E. McKenney May 19, 2022, 1:54 p.m. UTC | #12
On Thu, May 19, 2022 at 01:33:16PM +0200, Jiri Olsa wrote:
> On Wed, May 18, 2022 at 09:21:18AM -0700, Paul E. McKenney wrote:
> > On Tue, May 17, 2022 at 12:13:45PM +0200, Jiri Olsa wrote:
> > > On Mon, May 16, 2022 at 01:49:22PM +0200, Frederic Weisbecker wrote:
> > > > On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> > > > > On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > > > > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > > > > in rcu 'not watching' context and if there's tracer attached to
> > > > > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > > > > warning like:
> > > > > > 
> > > > > >   [    3.017540] WARNING: suspicious RCU usage
> > > > > >   ...
> > > > > >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > > > > >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > > > > >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > > > > >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > > > > >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > > > > >   [    3.018374]  0xffffffffa00080c8
> > > > > >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > > > > >   [    3.018398]  arch_cpu_idle+0x5/0x10
> > > > > >   [    3.018399]  default_idle_call+0x59/0x90
> > > > > >   [    3.018401]  do_idle+0x1c3/0x1d0
> > > > > > 
> > > > > > The call path is following:
> > > > > > 
> > > > > > default_idle_call
> > > > > >   rcu_idle_enter
> > > > > >   arch_cpu_idle
> > > > > >   rcu_idle_exit
> > > > > > 
> > > > > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > > > > path that are traceble and cause this problem on my setup.
> > > > > > 
> > > > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > > 
> > > > > From an RCU viewpoint:
> > > > > 
> > > > > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > > > > 
> > > > > [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> > > > > but there is no point given that local_irq_restore() isn't something
> > > > > you instrument anyway. ]
> > > > 
> > > > So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
> > > > it is instrumentable by the function (graph)  tracers and the irqsoff tracer.
> > > > 
> > > > Also it calls into lockdep that might make use of RCU.
> > > > 
> > > > That's why rcu_idle_exit() is not noinstr yet. See this patch:
> > > > 
> > > > https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/
> > > 
> > > I see, could we mark it at least with notrace meanwhile?
> > 
> > For the RCU part, how about as follows?
> > 
> > If this approach is reasonable, my guess would be that Frederic will pull
> > it into his context-tracking series, perhaps using a revert of this patch
> > to maintain sanity in the near term.
> > 
> > If this approach is unreasonable, well, that is Murphy for you!
> 
> I checked and it works in my test ;-)

Whew!!!  One piece of the problem might be solved, then.  ;-)

> > For the x86 idle part, my feeling is still that the rcu_idle_enter()
> > and rcu_idle_exit() need to be pushed deeper into the code.  Perhaps
> > an ongoing process as the idle loop continues to be dug deeper?
> 
> for arch_cpu_idle with noinstr I'm getting this W=1 warning:
> 
> vmlinux.o: warning: objtool: arch_cpu_idle()+0xb: call to {dynamic}() leaves .noinstr.text section
> 
> we could have it with notrace if that's a problem

I would be happy to queue the arch_cpu_idle() portion of your patch on
-rcu, if that would move things forward.  I suspect that additional
x86_idle() surgery is required, but maybe I am just getting confused
about what the x86_idle() function pointer can point to.  But it looks
to me like these need further help:

o	static void amd_e400_idle(void)
	Plus things it calls, like tick_broadcast_enter() and
	tick_broadcast_exit().

o	static __cpuidle void mwait_idle(void)

So it might not be all that much additional work, even if I have avoided
confusion about what the x86_idle() function pointer can point to.  But
I do not trust my ability to test this accurately.

Thoughts?

							Thanx, Paul

> thanks,
> jirka
> 
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > commit cd338be719a0a692e0d50e1a8438e1f6c7165d9c
> > Author: Paul E. McKenney <paulmck@kernel.org>
> > Date:   Tue May 17 21:00:04 2022 -0700
> > 
> >     rcu: Apply noinstr to rcu_idle_enter() and rcu_idle_exit()
> >     
> >     This commit applies the "noinstr" tag to the rcu_idle_enter() and
> >     rcu_idle_exit() functions, which are invoked from portions of the idle
> >     loop that cannot be instrumented.  These tags require reworking the
> >     rcu_eqs_enter() and rcu_eqs_exit() functions that these two functions
> >     invoke in order to cause them to use normal assertions rather than
> >     lockdep.  In addition, within rcu_idle_exit(), the raw versions of
> >     local_irq_save() and local_irq_restore() are used, again to avoid issues
> >     with lockdep in uninstrumented code.
> >     
> >     This patch is based in part on an earlier patch by Jiri Olsa, discussions
> >     with Peter Zijlstra and Frederic Weisbecker, earlier changes by Thomas
> >     Gleixner, and off-list discussions with Yonghong Song.
> >     
> >     Link: https://lore.kernel.org/lkml/20220515203653.4039075-1-jolsa@kernel.org/
> >     Reported-by: Jiri Olsa <jolsa@kernel.org>
> >     Reported-by: Alexei Starovoitov <ast@kernel.org>
> >     Reported-by: Andrii Nakryiko <andrii@kernel.org>
> >     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> >     Reviewed-by: Yonghong Song <yhs@fb.com>
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 222d59299a2af..02233b17cce0e 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -635,8 +635,8 @@ static noinstr void rcu_eqs_enter(bool user)
> >  		return;
> >  	}
> >  
> > -	lockdep_assert_irqs_disabled();
> >  	instrumentation_begin();
> > +	lockdep_assert_irqs_disabled();
> >  	trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&rdp->dynticks));
> >  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
> >  	rcu_preempt_deferred_qs(current);
> > @@ -663,9 +663,9 @@ static noinstr void rcu_eqs_enter(bool user)
> >   * If you add or remove a call to rcu_idle_enter(), be sure to test with
> >   * CONFIG_RCU_EQS_DEBUG=y.
> >   */
> > -void rcu_idle_enter(void)
> > +void noinstr rcu_idle_enter(void)
> >  {
> > -	lockdep_assert_irqs_disabled();
> > +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
> >  	rcu_eqs_enter(false);
> >  }
> >  EXPORT_SYMBOL_GPL(rcu_idle_enter);
> > @@ -865,7 +865,7 @@ static void noinstr rcu_eqs_exit(bool user)
> >  	struct rcu_data *rdp;
> >  	long oldval;
> >  
> > -	lockdep_assert_irqs_disabled();
> > +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
> >  	rdp = this_cpu_ptr(&rcu_data);
> >  	oldval = rdp->dynticks_nesting;
> >  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
> > @@ -900,13 +900,13 @@ static void noinstr rcu_eqs_exit(bool user)
> >   * If you add or remove a call to rcu_idle_exit(), be sure to test with
> >   * CONFIG_RCU_EQS_DEBUG=y.
> >   */
> > -void rcu_idle_exit(void)
> > +void noinstr rcu_idle_exit(void)
> >  {
> >  	unsigned long flags;
> >  
> > -	local_irq_save(flags);
> > +	raw_local_irq_save(flags);
> >  	rcu_eqs_exit(false);
> > -	local_irq_restore(flags);
> > +	raw_local_irq_restore(flags);
> >  }
> >  EXPORT_SYMBOL_GPL(rcu_idle_exit);
> >
Jiri Olsa May 23, 2022, 1:12 p.m. UTC | #13
On Thu, May 19, 2022 at 06:54:39AM -0700, Paul E. McKenney wrote:
> On Thu, May 19, 2022 at 01:33:16PM +0200, Jiri Olsa wrote:
> > On Wed, May 18, 2022 at 09:21:18AM -0700, Paul E. McKenney wrote:
> > > On Tue, May 17, 2022 at 12:13:45PM +0200, Jiri Olsa wrote:
> > > > On Mon, May 16, 2022 at 01:49:22PM +0200, Frederic Weisbecker wrote:
> > > > > On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> > > > > > On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > > > > > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > > > > > in rcu 'not watching' context and if there's tracer attached to
> > > > > > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > > > > > warning like:
> > > > > > > 
> > > > > > >   [    3.017540] WARNING: suspicious RCU usage
> > > > > > >   ...
> > > > > > >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > > > > > >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > > > > > >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > > > > > >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > > > > > >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > > > > > >   [    3.018374]  0xffffffffa00080c8
> > > > > > >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > > > > > >   [    3.018398]  arch_cpu_idle+0x5/0x10
> > > > > > >   [    3.018399]  default_idle_call+0x59/0x90
> > > > > > >   [    3.018401]  do_idle+0x1c3/0x1d0
> > > > > > > 
> > > > > > > The call path is following:
> > > > > > > 
> > > > > > > default_idle_call
> > > > > > >   rcu_idle_enter
> > > > > > >   arch_cpu_idle
> > > > > > >   rcu_idle_exit
> > > > > > > 
> > > > > > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > > > > > path that are traceble and cause this problem on my setup.
> > > > > > > 
> > > > > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > > > 
> > > > > > From an RCU viewpoint:
> > > > > > 
> > > > > > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > > > > > 
> > > > > > [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> > > > > > but there is no point given that local_irq_restore() isn't something
> > > > > > you instrument anyway. ]
> > > > > 
> > > > > So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
> > > > > it is instrumentable by the function (graph)  tracers and the irqsoff tracer.
> > > > > 
> > > > > Also it calls into lockdep that might make use of RCU.
> > > > > 
> > > > > That's why rcu_idle_exit() is not noinstr yet. See this patch:
> > > > > 
> > > > > https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/
> > > > 
> > > > I see, could we mark it at least with notrace meanwhile?
> > > 
> > > For the RCU part, how about as follows?
> > > 
> > > If this approach is reasonable, my guess would be that Frederic will pull
> > > it into his context-tracking series, perhaps using a revert of this patch
> > > to maintain sanity in the near term.
> > > 
> > > If this approach is unreasonable, well, that is Murphy for you!
> > 
> > I checked and it works in my test ;-)
> 
> Whew!!!  One piece of the problem might be solved, then.  ;-)
> 
> > > For the x86 idle part, my feeling is still that the rcu_idle_enter()
> > > and rcu_idle_exit() need to be pushed deeper into the code.  Perhaps
> > > an ongoing process as the idle loop continues to be dug deeper?
> > 
> > for arch_cpu_idle with noinstr I'm getting this W=1 warning:
> > 
> > vmlinux.o: warning: objtool: arch_cpu_idle()+0xb: call to {dynamic}() leaves .noinstr.text section
> > 
> > we could have it with notrace if that's a problem
> 
> I would be happy to queue the arch_cpu_idle() portion of your patch on
> -rcu, if that would move things forward.  I suspect that additional
> x86_idle() surgery is required, but maybe I am just getting confused
> about what the x86_idle() function pointer can point to.  But it looks
> to me like these need further help:
> 
> o	static void amd_e400_idle(void)
> 	Plus things it calls, like tick_broadcast_enter() and
> 	tick_broadcast_exit().
> 
> o	static __cpuidle void mwait_idle(void)
> 
> So it might not be all that much additional work, even if I have avoided
> confusion about what the x86_idle() function pointer can point to.  But
> I do not trust my ability to test this accurately.

same here ;-) you're right, there will be other places based
on x86_idle function pointer.. I'll check it, but perhaps we
could address that when someone reports that

jirka
Paul E. McKenney May 24, 2022, 5:33 p.m. UTC | #14
On Mon, May 23, 2022 at 03:12:28PM +0200, Jiri Olsa wrote:
> On Thu, May 19, 2022 at 06:54:39AM -0700, Paul E. McKenney wrote:
> > On Thu, May 19, 2022 at 01:33:16PM +0200, Jiri Olsa wrote:
> > > On Wed, May 18, 2022 at 09:21:18AM -0700, Paul E. McKenney wrote:
> > > > On Tue, May 17, 2022 at 12:13:45PM +0200, Jiri Olsa wrote:
> > > > > On Mon, May 16, 2022 at 01:49:22PM +0200, Frederic Weisbecker wrote:
> > > > > > On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> > > > > > > On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > > > > > > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > > > > > > in rcu 'not watching' context and if there's tracer attached to
> > > > > > > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > > > > > > warning like:
> > > > > > > > 
> > > > > > > >   [    3.017540] WARNING: suspicious RCU usage
> > > > > > > >   ...
> > > > > > > >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > > > > > > >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > > > > > > >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > > > > > > >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > > > > > > >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > > > > > > >   [    3.018374]  0xffffffffa00080c8
> > > > > > > >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > > > > > > >   [    3.018398]  arch_cpu_idle+0x5/0x10
> > > > > > > >   [    3.018399]  default_idle_call+0x59/0x90
> > > > > > > >   [    3.018401]  do_idle+0x1c3/0x1d0
> > > > > > > > 
> > > > > > > > The call path is following:
> > > > > > > > 
> > > > > > > > default_idle_call
> > > > > > > >   rcu_idle_enter
> > > > > > > >   arch_cpu_idle
> > > > > > > >   rcu_idle_exit
> > > > > > > > 
> > > > > > > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > > > > > > path that are traceble and cause this problem on my setup.
> > > > > > > > 
> > > > > > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > > > > 
> > > > > > > From an RCU viewpoint:
> > > > > > > 
> > > > > > > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > > > > > > 
> > > > > > > [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> > > > > > > but there is no point given that local_irq_restore() isn't something
> > > > > > > you instrument anyway. ]
> > > > > > 
> > > > > > So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
> > > > > > it is instrumentable by the function (graph)  tracers and the irqsoff tracer.
> > > > > > 
> > > > > > Also it calls into lockdep that might make use of RCU.
> > > > > > 
> > > > > > That's why rcu_idle_exit() is not noinstr yet. See this patch:
> > > > > > 
> > > > > > https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/
> > > > > 
> > > > > I see, could we mark it at least with notrace meanwhile?
> > > > 
> > > > For the RCU part, how about as follows?
> > > > 
> > > > If this approach is reasonable, my guess would be that Frederic will pull
> > > > it into his context-tracking series, perhaps using a revert of this patch
> > > > to maintain sanity in the near term.
> > > > 
> > > > If this approach is unreasonable, well, that is Murphy for you!
> > > 
> > > I checked and it works in my test ;-)
> > 
> > Whew!!!  One piece of the problem might be solved, then.  ;-)
> > 
> > > > For the x86 idle part, my feeling is still that the rcu_idle_enter()
> > > > and rcu_idle_exit() need to be pushed deeper into the code.  Perhaps
> > > > an ongoing process as the idle loop continues to be dug deeper?
> > > 
> > > for arch_cpu_idle with noinstr I'm getting this W=1 warning:
> > > 
> > > vmlinux.o: warning: objtool: arch_cpu_idle()+0xb: call to {dynamic}() leaves .noinstr.text section
> > > 
> > > we could have it with notrace if that's a problem
> > 
> > I would be happy to queue the arch_cpu_idle() portion of your patch on
> > -rcu, if that would move things forward.  I suspect that additional
> > x86_idle() surgery is required, but maybe I am just getting confused
> > about what the x86_idle() function pointer can point to.  But it looks
> > to me like these need further help:
> > 
> > o	static void amd_e400_idle(void)
> > 	Plus things it calls, like tick_broadcast_enter() and
> > 	tick_broadcast_exit().
> > 
> > o	static __cpuidle void mwait_idle(void)
> > 
> > So it might not be all that much additional work, even if I have avoided
> > confusion about what the x86_idle() function pointer can point to.  But
> > I do not trust my ability to test this accurately.
> 
> same here ;-) you're right, there will be other places based
> on x86_idle function pointer.. I'll check it, but perhaps we
> could address that when someone reports that
> 
> jirka

Any thoughts on the correct approach?  One extreme would be to
mark all sorts of things noinstr.  Another extreme would be to
enclose all sorts of things in RCU_NONIDLE().  Yet another extreme
would be to push the rcu_idle_enter() and rcu_idle_exit() calls
still deeper into the idle loop.

Or does Peter's recent series somehow cover all of this?

https://lore.kernel.org/all/20220519212750.656413111@infradead.org/

							Thanx, Paul
diff mbox series

Patch

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index b370767f5b19..1345cb0124a6 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -720,7 +720,7 @@  void arch_cpu_idle_dead(void)
 /*
  * Called from the generic idle code.
  */
-void arch_cpu_idle(void)
+void noinstr arch_cpu_idle(void)
 {
 	x86_idle();
 }
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index a4b8189455d5..20d529722f51 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -896,7 +896,7 @@  static void noinstr rcu_eqs_exit(bool user)
  * If you add or remove a call to rcu_idle_exit(), be sure to test with
  * CONFIG_RCU_EQS_DEBUG=y.
  */
-void rcu_idle_exit(void)
+void noinstr rcu_idle_exit(void)
 {
 	unsigned long flags;