
[rcu,13/17] srcu: Add SRCU-fast readers

Message ID 20250116202112.3783327-13-paulmck@kernel.org (mailing list archive)
State New
Series SRCU updates, including SRCU-fast

Commit Message

Paul E. McKenney Jan. 16, 2025, 8:21 p.m. UTC
This commit adds srcu_read_{,un}lock_fast(), which is similar
to srcu_read_{,un}lock_lite(), but avoids the array-indexing and
pointer-following overhead.  On a microbenchmark featuring tight
loops around empty readers, this results in about a 20% speedup
compared to RCU Tasks Trace on my x86 laptop.

Please note that SRCU-fast has drawbacks compared to RCU Tasks
Trace, including:

o	Lack of CPU stall warnings.
o	SRCU-fast readers permitted only where rcu_is_watching().
o	A pointer-sized return value from srcu_read_lock_fast() must
	be passed to the corresponding srcu_read_unlock_fast().
o	In the absence of readers, a synchronize_srcu() having _fast()
	readers will incur the latency of at least two normal RCU grace
	periods.
o	RCU Tasks Trace priority boosting could be easily added.
	Boosting SRCU readers is more difficult.

SRCU-fast also has a drawback compared to SRCU-lite, namely that the
return value from srcu_read_lock_fast() is a 64-bit pointer and
that from srcu_read_lock_lite() is only a 32-bit int.
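
For illustration, here is a minimal reader sketch assuming a hypothetical
srcu_struct named "my_srcu" (not taken from this series); it simply shows
the pointer returned by srcu_read_lock_fast() being handed back to the
matching srcu_read_unlock_fast():

DEFINE_SRCU(my_srcu);

void my_reader(void)
{
	struct srcu_ctr __percpu *scp;

	scp = srcu_read_lock_fast(&my_srcu);
	/* Read-side critical section protected by my_srcu. */
	srcu_read_unlock_fast(&my_srcu, scp);
}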

[ paulmck: Apply feedback from Akira Yokosawa. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: <bpf@vger.kernel.org>
---
 include/linux/srcu.h     | 47 ++++++++++++++++++++++++++++++++++++++--
 include/linux/srcutiny.h | 22 +++++++++++++++++++
 include/linux/srcutree.h | 38 ++++++++++++++++++++++++++++++++
 3 files changed, 105 insertions(+), 2 deletions(-)

Comments

Alexei Starovoitov Jan. 16, 2025, 9 p.m. UTC | #1
On Thu, Jan 16, 2025 at 12:21 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> +/*
> + * Counts the new reader in the appropriate per-CPU element of the
> + * srcu_struct.  Returns a pointer that must be passed to the matching
> + * srcu_read_unlock_fast().
> + *
> + * Note that this_cpu_inc() is an RCU read-side critical section either
> + * because it disables interrupts, because it is a single instruction,
> + * or because it is a read-modify-write atomic operation, depending on
> + * the whims of the architecture.
> + */
> +static inline struct srcu_ctr __percpu *__srcu_read_lock_fast(struct srcu_struct *ssp)
> +{
> +       struct srcu_ctr __percpu *scp = READ_ONCE(ssp->srcu_ctrp);
> +
> +       RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_lock_fast().");
> +       this_cpu_inc(scp->srcu_locks.counter); /* Y */
> +       barrier(); /* Avoid leaking the critical section. */
> +       return scp;
> +}

This doesn't look fast.
If I'm reading this correctly,
even with debugs off RCU_LOCKDEP_WARN() will still call
rcu_is_watching() and this doesn't look cheap or fast.
Peter Zijlstra Jan. 16, 2025, 9:38 p.m. UTC | #2
On Thu, Jan 16, 2025 at 01:00:24PM -0800, Alexei Starovoitov wrote:
> On Thu, Jan 16, 2025 at 12:21 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > +/*
> > + * Counts the new reader in the appropriate per-CPU element of the
> > + * srcu_struct.  Returns a pointer that must be passed to the matching
> > + * srcu_read_unlock_fast().
> > + *
> > + * Note that this_cpu_inc() is an RCU read-side critical section either
> > + * because it disables interrupts, because it is a single instruction,
> > + * or because it is a read-modify-write atomic operation, depending on
> > + * the whims of the architecture.
> > + */
> > +static inline struct srcu_ctr __percpu *__srcu_read_lock_fast(struct srcu_struct *ssp)
> > +{
> > +       struct srcu_ctr __percpu *scp = READ_ONCE(ssp->srcu_ctrp);
> > +
> > +       RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_lock_fast().");
> > +       this_cpu_inc(scp->srcu_locks.counter); /* Y */
> > +       barrier(); /* Avoid leaking the critical section. */
> > +       return scp;
> > +}
> 
> This doesn't look fast.
> If I'm reading this correctly,
> even with debugs off RCU_LOCKDEP_WARN() will still call
> rcu_is_watching() and this doesn't look cheap or fast.

this is the while (0 && (c)), thing, right? can't the compiler throw
that out in DCE ?
Andrii Nakryiko Jan. 16, 2025, 9:52 p.m. UTC | #3
On Thu, Jan 16, 2025 at 12:21 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> This commit adds srcu_read_{,un}lock_fast(), which is similar
> to srcu_read_{,un}lock_lite(), but avoids the array-indexing and
> pointer-following overhead.  On a microbenchmark featuring tight
> loops around empty readers, this results in about a 20% speedup
> compared to RCU Tasks Trace on my x86 laptop.
>
> Please note that SRCU-fast has drawbacks compared to RCU Tasks
> Trace, including:
>
> o       Lack of CPU stall warnings.
> o       SRCU-fast readers permitted only where rcu_is_watching().
> o       A pointer-sized return value from srcu_read_lock_fast() must
>         be passed to the corresponding srcu_read_unlock_fast().
> o       In the absence of readers, a synchronize_srcu() having _fast()
>         readers will incur the latency of at least two normal RCU grace
>         periods.
> o       RCU Tasks Trace priority boosting could be easily added.
>         Boosting SRCU readers is more difficult.
>
> SRCU-fast also has a drawback compared to SRCU-lite, namely that the
> return value from srcu_read_lock_fast() is a 64-bit pointer and
> that from srcu_read_lock_lite() is only a 32-bit int.
>
> [ paulmck: Apply feedback from Akira Yokosawa. ]
>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Andrii Nakryiko <andrii@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Kent Overstreet <kent.overstreet@linux.dev>
> Cc: <bpf@vger.kernel.org>
> ---
>  include/linux/srcu.h     | 47 ++++++++++++++++++++++++++++++++++++++--
>  include/linux/srcutiny.h | 22 +++++++++++++++++++
>  include/linux/srcutree.h | 38 ++++++++++++++++++++++++++++++++
>  3 files changed, 105 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> index 2bd0e24e9b554..63bddc3014238 100644
> --- a/include/linux/srcu.h
> +++ b/include/linux/srcu.h
> @@ -47,9 +47,10 @@ int init_srcu_struct(struct srcu_struct *ssp);
>  #define SRCU_READ_FLAVOR_NORMAL        0x1             // srcu_read_lock().
>  #define SRCU_READ_FLAVOR_NMI   0x2             // srcu_read_lock_nmisafe().
>  #define SRCU_READ_FLAVOR_LITE  0x4             // srcu_read_lock_lite().
> +#define SRCU_READ_FLAVOR_FAST  0x8             // srcu_read_lock_fast().
>  #define SRCU_READ_FLAVOR_ALL   (SRCU_READ_FLAVOR_NORMAL | SRCU_READ_FLAVOR_NMI | \
> -                               SRCU_READ_FLAVOR_LITE) // All of the above.
> -#define SRCU_READ_FLAVOR_SLOWGP        SRCU_READ_FLAVOR_LITE
> +                               SRCU_READ_FLAVOR_LITE | SRCU_READ_FLAVOR_FAST) // All of the above.
> +#define SRCU_READ_FLAVOR_SLOWGP        (SRCU_READ_FLAVOR_LITE | SRCU_READ_FLAVOR_FAST)
>                                                 // Flavors requiring synchronize_rcu()
>                                                 // instead of smp_mb().
>  void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp);
> @@ -253,6 +254,33 @@ static inline int srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp)
>         return retval;
>  }
>
> +/**
> + * srcu_read_lock_fast - register a new reader for an SRCU-protected structure.
> + * @ssp: srcu_struct in which to register the new reader.
> + *
> + * Enter an SRCU read-side critical section, but for a light-weight
> + * smp_mb()-free reader.  See srcu_read_lock() for more information.
> + *
> + * If srcu_read_lock_fast() is ever used on an srcu_struct structure,
> + * then none of the other flavors may be used, whether before, during,
> + * or after.  Note that grace-period auto-expediting is disabled for _fast
> + * srcu_struct structures because auto-expedited grace periods invoke
> + * synchronize_rcu_expedited(), IPIs and all.
> + *
> + * Note that srcu_read_lock_fast() can be invoked only from those contexts
> + * where RCU is watching, that is, from contexts where it would be legal
> + * to invoke rcu_read_lock().  Otherwise, lockdep will complain.
> + */
> +static inline struct srcu_ctr __percpu *srcu_read_lock_fast(struct srcu_struct *ssp) __acquires(ssp)
> +{
> +       struct srcu_ctr __percpu *retval;
> +
> +       srcu_check_read_flavor_force(ssp, SRCU_READ_FLAVOR_FAST);
> +       retval = __srcu_read_lock_fast(ssp);
> +       rcu_try_lock_acquire(&ssp->dep_map);
> +       return retval;
> +}
> +
>  /**
>   * srcu_read_lock_lite - register a new reader for an SRCU-protected structure.
>   * @ssp: srcu_struct in which to register the new reader.
> @@ -356,6 +384,21 @@ static inline void srcu_read_unlock(struct srcu_struct *ssp, int idx)
>         __srcu_read_unlock(ssp, idx);
>  }
>
> +/**
> + * srcu_read_unlock_fast - unregister a old reader from an SRCU-protected structure.
> + * @ssp: srcu_struct in which to unregister the old reader.
> + * @scp: return value from corresponding srcu_read_lock_fast().
> + *
> + * Exit a light-weight SRCU read-side critical section.
> + */
> +static inline void srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
> +       __releases(ssp)
> +{
> +       srcu_check_read_flavor(ssp, SRCU_READ_FLAVOR_FAST);
> +       srcu_lock_release(&ssp->dep_map);
> +       __srcu_read_unlock_fast(ssp, scp);
> +}
> +
>  /**
>   * srcu_read_unlock_lite - unregister a old reader from an SRCU-protected structure.
>   * @ssp: srcu_struct in which to unregister the old reader.
> diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
> index 07a0c4489ea2f..380260317d98b 100644
> --- a/include/linux/srcutiny.h
> +++ b/include/linux/srcutiny.h
> @@ -71,6 +71,28 @@ static inline int __srcu_read_lock(struct srcu_struct *ssp)
>         return idx;
>  }
>
> +struct srcu_ctr;
> +
> +static inline bool __srcu_ptr_to_ctr(struct srcu_struct *ssp, struct srcu_ctr __percpu *scpp)
> +{
> +       return (int)(intptr_t)(struct srcu_ctr __force __kernel *)scpp;
> +}
> +
> +static inline struct srcu_ctr __percpu *__srcu_ctr_to_ptr(struct srcu_struct *ssp, int idx)
> +{
> +       return (struct srcu_ctr __percpu *)(intptr_t)idx;
> +}
> +
> +static inline struct srcu_ctr __percpu *__srcu_read_lock_fast(struct srcu_struct *ssp)
> +{
> +       return __srcu_ctr_to_ptr(ssp, __srcu_read_lock(ssp));
> +}
> +
> +static inline void __srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
> +{
> +       __srcu_read_unlock(ssp, __srcu_ptr_to_ctr(ssp, scp));
> +}
> +
>  #define __srcu_read_lock_lite __srcu_read_lock
>  #define __srcu_read_unlock_lite __srcu_read_unlock
>
> diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
> index ef3065c0cadcd..bdc467efce3a2 100644
> --- a/include/linux/srcutree.h
> +++ b/include/linux/srcutree.h
> @@ -226,6 +226,44 @@ static inline struct srcu_ctr __percpu *__srcu_ctr_to_ptr(struct srcu_struct *ss
>         return &ssp->sda->srcu_ctrs[idx];
>  }
>
> +/*
> + * Counts the new reader in the appropriate per-CPU element of the
> + * srcu_struct.  Returns a pointer that must be passed to the matching
> + * srcu_read_unlock_fast().
> + *
> + * Note that this_cpu_inc() is an RCU read-side critical section either
> + * because it disables interrupts, because it is a single instruction,
> + * or because it is a read-modify-write atomic operation, depending on
> + * the whims of the architecture.
> + */
> +static inline struct srcu_ctr __percpu *__srcu_read_lock_fast(struct srcu_struct *ssp)
> +{
> +       struct srcu_ctr __percpu *scp = READ_ONCE(ssp->srcu_ctrp);
> +
> +       RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_lock_fast().");
> +       this_cpu_inc(scp->srcu_locks.counter); /* Y */
> +       barrier(); /* Avoid leaking the critical section. */
> +       return scp;
> +}
> +
> +/*
> + * Removes the count for the old reader from the appropriate
> + * per-CPU element of the srcu_struct.  Note that this may well be a
> + * different CPU than that which was incremented by the corresponding
> + * srcu_read_lock_fast(), but it must be within the same task.

hm... why the "same task" restriction? With uretprobes we take
srcu_read_lock under a traced task, but we can "release" this lock
from timer interrupt, which could be in the context of any task.

> + *
> + * Note that this_cpu_inc() is an RCU read-side critical section either
> + * because it disables interrupts, because it is a single instruction,
> + * or because it is a read-modify-write atomic operation, depending on
> + * the whims of the architecture.
> + */
> +static inline void __srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
> +{
> +       barrier();  /* Avoid leaking the critical section. */
> +       this_cpu_inc(scp->srcu_unlocks.counter);  /* Z */
> +       RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_unlock_fast().");
> +}
> +
>  /*
>   * Counts the new reader in the appropriate per-CPU element of the
>   * srcu_struct.  Returns an index that must be passed to the matching
> --
> 2.40.1
>
Paul E. McKenney Jan. 16, 2025, 9:55 p.m. UTC | #4
On Thu, Jan 16, 2025 at 01:00:24PM -0800, Alexei Starovoitov wrote:
> On Thu, Jan 16, 2025 at 12:21 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > +/*
> > + * Counts the new reader in the appropriate per-CPU element of the
> > + * srcu_struct.  Returns a pointer that must be passed to the matching
> > + * srcu_read_unlock_fast().
> > + *
> > + * Note that this_cpu_inc() is an RCU read-side critical section either
> > + * because it disables interrupts, because it is a single instruction,
> > + * or because it is a read-modify-write atomic operation, depending on
> > + * the whims of the architecture.
> > + */
> > +static inline struct srcu_ctr __percpu *__srcu_read_lock_fast(struct srcu_struct *ssp)
> > +{
> > +       struct srcu_ctr __percpu *scp = READ_ONCE(ssp->srcu_ctrp);
> > +
> > +       RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_lock_fast().");
> > +       this_cpu_inc(scp->srcu_locks.counter); /* Y */
> > +       barrier(); /* Avoid leaking the critical section. */
> > +       return scp;
> > +}
> 
> This doesn't look fast.
> If I'm reading this correctly,
> even with debugs off RCU_LOCKDEP_WARN() will still call
> rcu_is_watching() and this doesn't look cheap or fast.

Here is the CONFIG_PROVE_RCU=n definition:

	#define RCU_LOCKDEP_WARN(c, s) do { } while (0 && (c))

The "0" in the "0 && (c)" should prevent that call to rcu_is_watching().

But why not see what the compiler thinks?  I added the following function
to kernel/rcu/srcutree.c:

struct srcu_ctr __percpu *test_srcu_read_lock_fast(struct srcu_struct *ssp)
{
	struct srcu_ctr __percpu *p;

	p = srcu_read_lock_fast(ssp);
	return p;
}

This function compiles to the following code:

Dump of assembler code for function test_srcu_read_lock_fast:
   0xffffffff811220c0 <+0>:     endbr64
   0xffffffff811220c4 <+4>:     sub    $0x8,%rsp
   0xffffffff811220c8 <+8>:     mov    0x8(%rdi),%rax
   0xffffffff811220cc <+12>:    add    %gs:0x7eef3944(%rip),%rax        # 0x15a18 <this_cpu_off>
   0xffffffff811220d4 <+20>:    mov    0x20(%rax),%eax
   0xffffffff811220d7 <+23>:    test   $0x8,%al
   0xffffffff811220d9 <+25>:    je     0xffffffff811220eb <test_srcu_read_lock_fast+43>
   0xffffffff811220db <+27>:    mov    (%rdi),%rax
   0xffffffff811220de <+30>:    incq   %gs:(%rax)
   0xffffffff811220e2 <+34>:    add    $0x8,%rsp
   0xffffffff811220e6 <+38>:    jmp    0xffffffff81f5fe60 <__x86_return_thunk>
   0xffffffff811220eb <+43>:    mov    $0x8,%esi
   0xffffffff811220f0 <+48>:    mov    %rdi,(%rsp)
   0xffffffff811220f4 <+52>:    call   0xffffffff8111fb90 <__srcu_check_read_flavor>
   0xffffffff811220f9 <+57>:    mov    (%rsp),%rdi
   0xffffffff811220fd <+61>:    jmp    0xffffffff811220db <test_srcu_read_lock_fast+27>

The first call to srcu_read_lock_fast() invokes __srcu_check_read_flavor(),
but after that the "je" instruction will fall through.  So the common-case
code path executes only the part of this function up to and including the
"jmp 0xffffffff81f5fe60 <__x86_return_thunk>".

Does that serve?

							Thanx, Paul
Paul E. McKenney Jan. 16, 2025, 10:54 p.m. UTC | #5
On Thu, Jan 16, 2025 at 01:52:53PM -0800, Andrii Nakryiko wrote:
> On Thu, Jan 16, 2025 at 12:21 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > This commit adds srcu_read_{,un}lock_fast(), which is similar
> > to srcu_read_{,un}lock_lite(), but avoids the array-indexing and
> > pointer-following overhead.  On a microbenchmark featuring tight
> > loops around empty readers, this results in about a 20% speedup
> > compared to RCU Tasks Trace on my x86 laptop.
> >
> > Please note that SRCU-fast has drawbacks compared to RCU Tasks
> > Trace, including:
> >
> > o       Lack of CPU stall warnings.
> > o       SRCU-fast readers permitted only where rcu_is_watching().
> > o       A pointer-sized return value from srcu_read_lock_fast() must
> >         be passed to the corresponding srcu_read_unlock_fast().
> > o       In the absence of readers, a synchronize_srcu() having _fast()
> >         readers will incur the latency of at least two normal RCU grace
> >         periods.
> > o       RCU Tasks Trace priority boosting could be easily added.
> >         Boosting SRCU readers is more difficult.
> >
> > SRCU-fast also has a drawback compared to SRCU-lite, namely that the
> > return value from srcu_read_lock_fast() is a 64-bit pointer and
> > that from srcu_read_lock_lite() is only a 32-bit int.
> >
> > [ paulmck: Apply feedback from Akira Yokosawa. ]
> >
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > Cc: Alexei Starovoitov <ast@kernel.org>
> > Cc: Andrii Nakryiko <andrii@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Kent Overstreet <kent.overstreet@linux.dev>
> > Cc: <bpf@vger.kernel.org>
> > ---
> >  include/linux/srcu.h     | 47 ++++++++++++++++++++++++++++++++++++++--
> >  include/linux/srcutiny.h | 22 +++++++++++++++++++
> >  include/linux/srcutree.h | 38 ++++++++++++++++++++++++++++++++
> >  3 files changed, 105 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> > index 2bd0e24e9b554..63bddc3014238 100644
> > --- a/include/linux/srcu.h
> > +++ b/include/linux/srcu.h
> > @@ -47,9 +47,10 @@ int init_srcu_struct(struct srcu_struct *ssp);
> >  #define SRCU_READ_FLAVOR_NORMAL        0x1             // srcu_read_lock().
> >  #define SRCU_READ_FLAVOR_NMI   0x2             // srcu_read_lock_nmisafe().
> >  #define SRCU_READ_FLAVOR_LITE  0x4             // srcu_read_lock_lite().
> > +#define SRCU_READ_FLAVOR_FAST  0x8             // srcu_read_lock_fast().
> >  #define SRCU_READ_FLAVOR_ALL   (SRCU_READ_FLAVOR_NORMAL | SRCU_READ_FLAVOR_NMI | \
> > -                               SRCU_READ_FLAVOR_LITE) // All of the above.
> > -#define SRCU_READ_FLAVOR_SLOWGP        SRCU_READ_FLAVOR_LITE
> > +                               SRCU_READ_FLAVOR_LITE | SRCU_READ_FLAVOR_FAST) // All of the above.
> > +#define SRCU_READ_FLAVOR_SLOWGP        (SRCU_READ_FLAVOR_LITE | SRCU_READ_FLAVOR_FAST)
> >                                                 // Flavors requiring synchronize_rcu()
> >                                                 // instead of smp_mb().
> >  void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp);
> > @@ -253,6 +254,33 @@ static inline int srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp)
> >         return retval;
> >  }
> >
> > +/**
> > + * srcu_read_lock_fast - register a new reader for an SRCU-protected structure.
> > + * @ssp: srcu_struct in which to register the new reader.
> > + *
> > + * Enter an SRCU read-side critical section, but for a light-weight
> > + * smp_mb()-free reader.  See srcu_read_lock() for more information.
> > + *
> > + * If srcu_read_lock_fast() is ever used on an srcu_struct structure,
> > + * then none of the other flavors may be used, whether before, during,
> > + * or after.  Note that grace-period auto-expediting is disabled for _fast
> > + * srcu_struct structures because auto-expedited grace periods invoke
> > + * synchronize_rcu_expedited(), IPIs and all.
> > + *
> > + * Note that srcu_read_lock_fast() can be invoked only from those contexts
> > + * where RCU is watching, that is, from contexts where it would be legal
> > + * to invoke rcu_read_lock().  Otherwise, lockdep will complain.
> > + */
> > +static inline struct srcu_ctr __percpu *srcu_read_lock_fast(struct srcu_struct *ssp) __acquires(ssp)
> > +{
> > +       struct srcu_ctr __percpu *retval;
> > +
> > +       srcu_check_read_flavor_force(ssp, SRCU_READ_FLAVOR_FAST);
> > +       retval = __srcu_read_lock_fast(ssp);
> > +       rcu_try_lock_acquire(&ssp->dep_map);
> > +       return retval;
> > +}
> > +
> >  /**
> >   * srcu_read_lock_lite - register a new reader for an SRCU-protected structure.
> >   * @ssp: srcu_struct in which to register the new reader.
> > @@ -356,6 +384,21 @@ static inline void srcu_read_unlock(struct srcu_struct *ssp, int idx)
> >         __srcu_read_unlock(ssp, idx);
> >  }
> >
> > +/**
> > + * srcu_read_unlock_fast - unregister a old reader from an SRCU-protected structure.
> > + * @ssp: srcu_struct in which to unregister the old reader.
> > + * @scp: return value from corresponding srcu_read_lock_fast().
> > + *
> > + * Exit a light-weight SRCU read-side critical section.
> > + */
> > +static inline void srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
> > +       __releases(ssp)
> > +{
> > +       srcu_check_read_flavor(ssp, SRCU_READ_FLAVOR_FAST);
> > +       srcu_lock_release(&ssp->dep_map);
> > +       __srcu_read_unlock_fast(ssp, scp);
> > +}
> > +
> >  /**
> >   * srcu_read_unlock_lite - unregister a old reader from an SRCU-protected structure.
> >   * @ssp: srcu_struct in which to unregister the old reader.
> > diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
> > index 07a0c4489ea2f..380260317d98b 100644
> > --- a/include/linux/srcutiny.h
> > +++ b/include/linux/srcutiny.h
> > @@ -71,6 +71,28 @@ static inline int __srcu_read_lock(struct srcu_struct *ssp)
> >         return idx;
> >  }
> >
> > +struct srcu_ctr;
> > +
> > +static inline bool __srcu_ptr_to_ctr(struct srcu_struct *ssp, struct srcu_ctr __percpu *scpp)
> > +{
> > +       return (int)(intptr_t)(struct srcu_ctr __force __kernel *)scpp;
> > +}
> > +
> > +static inline struct srcu_ctr __percpu *__srcu_ctr_to_ptr(struct srcu_struct *ssp, int idx)
> > +{
> > +       return (struct srcu_ctr __percpu *)(intptr_t)idx;
> > +}
> > +
> > +static inline struct srcu_ctr __percpu *__srcu_read_lock_fast(struct srcu_struct *ssp)
> > +{
> > +       return __srcu_ctr_to_ptr(ssp, __srcu_read_lock(ssp));
> > +}
> > +
> > +static inline void __srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
> > +{
> > +       __srcu_read_unlock(ssp, __srcu_ptr_to_ctr(ssp, scp));
> > +}
> > +
> >  #define __srcu_read_lock_lite __srcu_read_lock
> >  #define __srcu_read_unlock_lite __srcu_read_unlock
> >
> > diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
> > index ef3065c0cadcd..bdc467efce3a2 100644
> > --- a/include/linux/srcutree.h
> > +++ b/include/linux/srcutree.h
> > @@ -226,6 +226,44 @@ static inline struct srcu_ctr __percpu *__srcu_ctr_to_ptr(struct srcu_struct *ss
> >         return &ssp->sda->srcu_ctrs[idx];
> >  }
> >
> > +/*
> > + * Counts the new reader in the appropriate per-CPU element of the
> > + * srcu_struct.  Returns a pointer that must be passed to the matching
> > + * srcu_read_unlock_fast().
> > + *
> > + * Note that this_cpu_inc() is an RCU read-side critical section either
> > + * because it disables interrupts, because it is a single instruction,
> > + * or because it is a read-modify-write atomic operation, depending on
> > + * the whims of the architecture.
> > + */
> > +static inline struct srcu_ctr __percpu *__srcu_read_lock_fast(struct srcu_struct *ssp)
> > +{
> > +       struct srcu_ctr __percpu *scp = READ_ONCE(ssp->srcu_ctrp);
> > +
> > +       RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_lock_fast().");
> > +       this_cpu_inc(scp->srcu_locks.counter); /* Y */
> > +       barrier(); /* Avoid leaking the critical section. */
> > +       return scp;
> > +}
> > +
> > +/*
> > + * Removes the count for the old reader from the appropriate
> > + * per-CPU element of the srcu_struct.  Note that this may well be a
> > + * different CPU than that which was incremented by the corresponding
> > + * srcu_read_lock_fast(), but it must be within the same task.
> 
> hm... why the "same task" restriction? With uretprobes we take
> srcu_read_lock under a traced task, but we can "release" this lock
> from timer interrupt, which could be in the context of any task.

A kneejerk reaction on my part?  ;-)

But the good news is that this restriction is easy for me to relax.
I adjust the comments and remove the rcu_try_lock_acquire()
from srcu_read_lock_fast() and the srcu_lock_release() from
srcu_read_unlock_fast().

But in that case, I should rename this to srcu_down_read_fast() and
srcu_up_read_fast() or similar, as these names would clearly indicate
that they can be used cross-task.  (And from interrupt handlers, but
not from NMI handlers.)

Also, srcu_read_lock() and srcu_read_unlock() have srcu_lock_acquire() and
srcu_lock_release() calls, so I am surprised that lockdep didn't complain
when you invoked srcu_read_lock() from task context and srcu_read_unlock()
from a timer interrupt.

							Thanx, Paul

> > + *
> > + * Note that this_cpu_inc() is an RCU read-side critical section either
> > + * because it disables interrupts, because it is a single instruction,
> > + * or because it is a read-modify-write atomic operation, depending on
> > + * the whims of the architecture.
> > + */
> > +static inline void __srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
> > +{
> > +       barrier();  /* Avoid leaking the critical section. */
> > +       this_cpu_inc(scp->srcu_unlocks.counter);  /* Z */
> > +       RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_unlock_fast().");
> > +}
> > +
> >  /*
> >   * Counts the new reader in the appropriate per-CPU element of the
> >   * srcu_struct.  Returns an index that must be passed to the matching
> > --
> > 2.40.1
> >
Andrii Nakryiko Jan. 16, 2025, 10:57 p.m. UTC | #6
On Thu, Jan 16, 2025 at 2:54 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Jan 16, 2025 at 01:52:53PM -0800, Andrii Nakryiko wrote:
> > On Thu, Jan 16, 2025 at 12:21 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > This commit adds srcu_read_{,un}lock_fast(), which is similar
> > > to srcu_read_{,un}lock_lite(), but avoids the array-indexing and
> > > pointer-following overhead.  On a microbenchmark featuring tight
> > > loops around empty readers, this results in about a 20% speedup
> > > compared to RCU Tasks Trace on my x86 laptop.
> > >
> > > Please note that SRCU-fast has drawbacks compared to RCU Tasks
> > > Trace, including:
> > >
> > > o       Lack of CPU stall warnings.
> > > o       SRCU-fast readers permitted only where rcu_is_watching().
> > > o       A pointer-sized return value from srcu_read_lock_fast() must
> > >         be passed to the corresponding srcu_read_unlock_fast().
> > > o       In the absence of readers, a synchronize_srcu() having _fast()
> > >         readers will incur the latency of at least two normal RCU grace
> > >         periods.
> > > o       RCU Tasks Trace priority boosting could be easily added.
> > >         Boosting SRCU readers is more difficult.
> > >
> > > SRCU-fast also has a drawback compared to SRCU-lite, namely that the
> > > return value from srcu_read_lock_fast() is a 64-bit pointer and
> > > that from srcu_read_lock_lite() is only a 32-bit int.
> > >
> > > [ paulmck: Apply feedback from Akira Yokosawa. ]
> > >
> > > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > > Cc: Alexei Starovoitov <ast@kernel.org>
> > > Cc: Andrii Nakryiko <andrii@kernel.org>
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Cc: Kent Overstreet <kent.overstreet@linux.dev>
> > > Cc: <bpf@vger.kernel.org>
> > > ---
> > >  include/linux/srcu.h     | 47 ++++++++++++++++++++++++++++++++++++++--
> > >  include/linux/srcutiny.h | 22 +++++++++++++++++++
> > >  include/linux/srcutree.h | 38 ++++++++++++++++++++++++++++++++
> > >  3 files changed, 105 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> > > index 2bd0e24e9b554..63bddc3014238 100644
> > > --- a/include/linux/srcu.h
> > > +++ b/include/linux/srcu.h
> > > @@ -47,9 +47,10 @@ int init_srcu_struct(struct srcu_struct *ssp);
> > >  #define SRCU_READ_FLAVOR_NORMAL        0x1             // srcu_read_lock().
> > >  #define SRCU_READ_FLAVOR_NMI   0x2             // srcu_read_lock_nmisafe().
> > >  #define SRCU_READ_FLAVOR_LITE  0x4             // srcu_read_lock_lite().
> > > +#define SRCU_READ_FLAVOR_FAST  0x8             // srcu_read_lock_fast().
> > >  #define SRCU_READ_FLAVOR_ALL   (SRCU_READ_FLAVOR_NORMAL | SRCU_READ_FLAVOR_NMI | \
> > > -                               SRCU_READ_FLAVOR_LITE) // All of the above.
> > > -#define SRCU_READ_FLAVOR_SLOWGP        SRCU_READ_FLAVOR_LITE
> > > +                               SRCU_READ_FLAVOR_LITE | SRCU_READ_FLAVOR_FAST) // All of the above.
> > > +#define SRCU_READ_FLAVOR_SLOWGP        (SRCU_READ_FLAVOR_LITE | SRCU_READ_FLAVOR_FAST)
> > >                                                 // Flavors requiring synchronize_rcu()
> > >                                                 // instead of smp_mb().
> > >  void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp);
> > > @@ -253,6 +254,33 @@ static inline int srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp)
> > >         return retval;
> > >  }
> > >
> > > +/**
> > > + * srcu_read_lock_fast - register a new reader for an SRCU-protected structure.
> > > + * @ssp: srcu_struct in which to register the new reader.
> > > + *
> > > + * Enter an SRCU read-side critical section, but for a light-weight
> > > + * smp_mb()-free reader.  See srcu_read_lock() for more information.
> > > + *
> > > + * If srcu_read_lock_fast() is ever used on an srcu_struct structure,
> > > + * then none of the other flavors may be used, whether before, during,
> > > + * or after.  Note that grace-period auto-expediting is disabled for _fast
> > > + * srcu_struct structures because auto-expedited grace periods invoke
> > > + * synchronize_rcu_expedited(), IPIs and all.
> > > + *
> > > + * Note that srcu_read_lock_fast() can be invoked only from those contexts
> > > + * where RCU is watching, that is, from contexts where it would be legal
> > > + * to invoke rcu_read_lock().  Otherwise, lockdep will complain.
> > > + */
> > > +static inline struct srcu_ctr __percpu *srcu_read_lock_fast(struct srcu_struct *ssp) __acquires(ssp)
> > > +{
> > > +       struct srcu_ctr __percpu *retval;
> > > +
> > > +       srcu_check_read_flavor_force(ssp, SRCU_READ_FLAVOR_FAST);
> > > +       retval = __srcu_read_lock_fast(ssp);
> > > +       rcu_try_lock_acquire(&ssp->dep_map);
> > > +       return retval;
> > > +}
> > > +
> > >  /**
> > >   * srcu_read_lock_lite - register a new reader for an SRCU-protected structure.
> > >   * @ssp: srcu_struct in which to register the new reader.
> > > @@ -356,6 +384,21 @@ static inline void srcu_read_unlock(struct srcu_struct *ssp, int idx)
> > >         __srcu_read_unlock(ssp, idx);
> > >  }
> > >
> > > +/**
> > > + * srcu_read_unlock_fast - unregister a old reader from an SRCU-protected structure.
> > > + * @ssp: srcu_struct in which to unregister the old reader.
> > > + * @scp: return value from corresponding srcu_read_lock_fast().
> > > + *
> > > + * Exit a light-weight SRCU read-side critical section.
> > > + */
> > > +static inline void srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
> > > +       __releases(ssp)
> > > +{
> > > +       srcu_check_read_flavor(ssp, SRCU_READ_FLAVOR_FAST);
> > > +       srcu_lock_release(&ssp->dep_map);
> > > +       __srcu_read_unlock_fast(ssp, scp);
> > > +}
> > > +
> > >  /**
> > >   * srcu_read_unlock_lite - unregister a old reader from an SRCU-protected structure.
> > >   * @ssp: srcu_struct in which to unregister the old reader.
> > > diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
> > > index 07a0c4489ea2f..380260317d98b 100644
> > > --- a/include/linux/srcutiny.h
> > > +++ b/include/linux/srcutiny.h
> > > @@ -71,6 +71,28 @@ static inline int __srcu_read_lock(struct srcu_struct *ssp)
> > >         return idx;
> > >  }
> > >
> > > +struct srcu_ctr;
> > > +
> > > +static inline bool __srcu_ptr_to_ctr(struct srcu_struct *ssp, struct srcu_ctr __percpu *scpp)
> > > +{
> > > +       return (int)(intptr_t)(struct srcu_ctr __force __kernel *)scpp;
> > > +}
> > > +
> > > +static inline struct srcu_ctr __percpu *__srcu_ctr_to_ptr(struct srcu_struct *ssp, int idx)
> > > +{
> > > +       return (struct srcu_ctr __percpu *)(intptr_t)idx;
> > > +}
> > > +
> > > +static inline struct srcu_ctr __percpu *__srcu_read_lock_fast(struct srcu_struct *ssp)
> > > +{
> > > +       return __srcu_ctr_to_ptr(ssp, __srcu_read_lock(ssp));
> > > +}
> > > +
> > > +static inline void __srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
> > > +{
> > > +       __srcu_read_unlock(ssp, __srcu_ptr_to_ctr(ssp, scp));
> > > +}
> > > +
> > >  #define __srcu_read_lock_lite __srcu_read_lock
> > >  #define __srcu_read_unlock_lite __srcu_read_unlock
> > >
> > > diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
> > > index ef3065c0cadcd..bdc467efce3a2 100644
> > > --- a/include/linux/srcutree.h
> > > +++ b/include/linux/srcutree.h
> > > @@ -226,6 +226,44 @@ static inline struct srcu_ctr __percpu *__srcu_ctr_to_ptr(struct srcu_struct *ss
> > >         return &ssp->sda->srcu_ctrs[idx];
> > >  }
> > >
> > > +/*
> > > + * Counts the new reader in the appropriate per-CPU element of the
> > > + * srcu_struct.  Returns a pointer that must be passed to the matching
> > > + * srcu_read_unlock_fast().
> > > + *
> > > + * Note that this_cpu_inc() is an RCU read-side critical section either
> > > + * because it disables interrupts, because it is a single instruction,
> > > + * or because it is a read-modify-write atomic operation, depending on
> > > + * the whims of the architecture.
> > > + */
> > > +static inline struct srcu_ctr __percpu *__srcu_read_lock_fast(struct srcu_struct *ssp)
> > > +{
> > > +       struct srcu_ctr __percpu *scp = READ_ONCE(ssp->srcu_ctrp);
> > > +
> > > +       RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_lock_fast().");
> > > +       this_cpu_inc(scp->srcu_locks.counter); /* Y */
> > > +       barrier(); /* Avoid leaking the critical section. */
> > > +       return scp;
> > > +}
> > > +
> > > +/*
> > > + * Removes the count for the old reader from the appropriate
> > > + * per-CPU element of the srcu_struct.  Note that this may well be a
> > > + * different CPU than that which was incremented by the corresponding
> > > + * srcu_read_lock_fast(), but it must be within the same task.
> >
> > hm... why the "same task" restriction? With uretprobes we take
> > srcu_read_lock under a traced task, but we can "release" this lock
> > from timer interrupt, which could be in the context of any task.
>
> A kneejerk reaction on my part?  ;-)
>
> But the good news is that this restriction is easy for me to relax.
> I adjust the comments and remove the rcu_try_lock_acquire()
> from srcu_read_lock_fast() and the srcu_lock_release() from
> srcu_read_unlock_fast().
>
> But in that case, I should rename this to srcu_down_read_fast() and
> srcu_up_read_fast() or similar, as these names would clearly indicate
> that they can be used cross-task.  (And from interrupt handlers, but
> not from NMI handlers.)
>
> Also, srcu_read_lock() and srcu_read_unlock() have srcu_lock_acquire() and
> srcu_lock_release() calls, so I am surprised that lockdep didn't complain
> when you invoked srcu_read_lock() from task context and srcu_read_unlock()
> from a timer interrupt.

That's because I'm using __srcu_read_lock and __srcu_read_unlock ;)
See this part in kernel/events/uprobes.c:

/* __srcu_read_lock() because SRCU lock survives switch to user space */
srcu_idx = __srcu_read_lock(&uretprobes_srcu);
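
For concreteness, a minimal sketch of that cross-context pattern (the
helper names and the timer below are made up for illustration; the real
logic lives in kernel/events/uprobes.c): the raw __srcu_read_lock() runs
in the traced task's context and the matching __srcu_read_unlock() runs
later from a timer callback, potentially in the context of a different
task.

static int uretprobe_srcu_idx;
static struct timer_list uretprobe_timer;

static void uretprobe_timer_fn(struct timer_list *t)
{
	/* May execute in the context of any task. */
	__srcu_read_unlock(&uretprobes_srcu, uretprobe_srcu_idx);
}

static void uretprobe_take_ref(void)
{
	/* Executes in the traced task's context. */
	uretprobe_srcu_idx = __srcu_read_lock(&uretprobes_srcu);
	timer_setup(&uretprobe_timer, uretprobe_timer_fn, 0);
	mod_timer(&uretprobe_timer, jiffies + HZ);
}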

>
>                                                         Thanx, Paul
>
> > > + *
> > > + * Note that this_cpu_inc() is an RCU read-side critical section either
> > > + * because it disables interrupts, because it is a single instruction,
> > > + * or because it is a read-modify-write atomic operation, depending on
> > > + * the whims of the architecture.
> > > + */
> > > +static inline void __srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
> > > +{
> > > +       barrier();  /* Avoid leaking the critical section. */
> > > +       this_cpu_inc(scp->srcu_unlocks.counter);  /* Z */
> > > +       RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_unlock_fast().");
> > > +}
> > > +
> > >  /*
> > >   * Counts the new reader in the appropriate per-CPU element of the
> > >   * srcu_struct.  Returns an index that must be passed to the matching
> > > --
> > > 2.40.1
> > >
Alexei Starovoitov Jan. 16, 2025, 10:58 p.m. UTC | #7
On Thu, Jan 16, 2025 at 1:55 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Jan 16, 2025 at 01:00:24PM -0800, Alexei Starovoitov wrote:
> > On Thu, Jan 16, 2025 at 12:21 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > +/*
> > > + * Counts the new reader in the appropriate per-CPU element of the
> > > + * srcu_struct.  Returns a pointer that must be passed to the matching
> > > + * srcu_read_unlock_fast().
> > > + *
> > > + * Note that this_cpu_inc() is an RCU read-side critical section either
> > > + * because it disables interrupts, because it is a single instruction,
> > > + * or because it is a read-modify-write atomic operation, depending on
> > > + * the whims of the architecture.
> > > + */
> > > +static inline struct srcu_ctr __percpu *__srcu_read_lock_fast(struct srcu_struct *ssp)
> > > +{
> > > +       struct srcu_ctr __percpu *scp = READ_ONCE(ssp->srcu_ctrp);
> > > +
> > > +       RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_lock_fast().");
> > > +       this_cpu_inc(scp->srcu_locks.counter); /* Y */
> > > +       barrier(); /* Avoid leaking the critical section. */
> > > +       return scp;
> > > +}
> >
> > This doesn't look fast.
> > If I'm reading this correctly,
> > even with debugs off RCU_LOCKDEP_WARN() will still call
> > rcu_is_watching() and this doesn't look cheap or fast.
>
> Here is the CONFIG_PROVE_RCU=n definition:
>
>         #define RCU_LOCKDEP_WARN(c, s) do { } while (0 && (c))
>
> The "0" in the "0 && (c)" should prevent that call to rcu_is_watching().
>
> But why not see what the compiler thinks?  I added the following function
> to kernel/rcu/srcutree.c:
>
> struct srcu_ctr __percpu *test_srcu_read_lock_fast(struct srcu_struct *ssp)
> {
>         struct srcu_ctr __percpu *p;
>
>         p = srcu_read_lock_fast(ssp);
>         return p;
> }
>
> This function compiles to the following code:
>
> Dump of assembler code for function test_srcu_read_lock_fast:
>    0xffffffff811220c0 <+0>:     endbr64
>    0xffffffff811220c4 <+4>:     sub    $0x8,%rsp
>    0xffffffff811220c8 <+8>:     mov    0x8(%rdi),%rax
>    0xffffffff811220cc <+12>:    add    %gs:0x7eef3944(%rip),%rax        # 0x15a18 <this_cpu_off>
>    0xffffffff811220d4 <+20>:    mov    0x20(%rax),%eax
>    0xffffffff811220d7 <+23>:    test   $0x8,%al
>    0xffffffff811220d9 <+25>:    je     0xffffffff811220eb <test_srcu_read_lock_fast+43>
>    0xffffffff811220db <+27>:    mov    (%rdi),%rax
>    0xffffffff811220de <+30>:    incq   %gs:(%rax)
>    0xffffffff811220e2 <+34>:    add    $0x8,%rsp
>    0xffffffff811220e6 <+38>:    jmp    0xffffffff81f5fe60 <__x86_return_thunk>
>    0xffffffff811220eb <+43>:    mov    $0x8,%esi
>    0xffffffff811220f0 <+48>:    mov    %rdi,(%rsp)
>    0xffffffff811220f4 <+52>:    call   0xffffffff8111fb90 <__srcu_check_read_flavor>
>    0xffffffff811220f9 <+57>:    mov    (%rsp),%rdi
>    0xffffffff811220fd <+61>:    jmp    0xffffffff811220db <test_srcu_read_lock_fast+27>
>
> The first call to srcu_read_lock_fast() invokes __srcu_check_read_flavor(),
> but after that the "je" instruction will fall through.  So the common-case
> code path executes only the part of this function up to and including the
> "jmp 0xffffffff81f5fe60 <__x86_return_thunk>".
>
> Does that serve?

Thanks for checking.
I was worried that in case of:
while (0 && (c))
the compiler might need to still emit (c) since it cannot prove
that there are no side effects there.
I vaguely recall now that C standard allows to ignore 2nd expression
when the first expression satisfies the boolean condition.
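For what it's worth, a standalone sketch (not from the kernel sources) of
why that guarantee makes the check disappear: && evaluates left to right
and short-circuits on the constant 0, so the right-hand operand is never
evaluated and the empty loop is dead code.

/* With CONFIG_PROVE_RCU=n, RCU_LOCKDEP_WARN(c, s) expands to do { } while (0 && (c)). */
static inline void lockdep_warn_example(void)
{
	do { } while (0 && rcu_is_watching());	/* rcu_is_watching() is never called */
}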
Paul E. McKenney Jan. 17, 2025, 12:07 a.m. UTC | #8
On Thu, Jan 16, 2025 at 02:57:25PM -0800, Andrii Nakryiko wrote:
> On Thu, Jan 16, 2025 at 2:54 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Jan 16, 2025 at 01:52:53PM -0800, Andrii Nakryiko wrote:
> > > On Thu, Jan 16, 2025 at 12:21 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > This commit adds srcu_read_{,un}lock_fast(), which is similar
> > > > to srcu_read_{,un}lock_lite(), but avoids the array-indexing and
> > > > pointer-following overhead.  On a microbenchmark featuring tight
> > > > loops around empty readers, this results in about a 20% speedup
> > > > compared to RCU Tasks Trace on my x86 laptop.
> > > >
> > > > Please note that SRCU-fast has drawbacks compared to RCU Tasks
> > > > Trace, including:
> > > >
> > > > o       Lack of CPU stall warnings.
> > > > o       SRCU-fast readers permitted only where rcu_is_watching().
> > > > o       A pointer-sized return value from srcu_read_lock_fast() must
> > > >         be passed to the corresponding srcu_read_unlock_fast().
> > > > o       In the absence of readers, a synchronize_srcu() having _fast()
> > > >         readers will incur the latency of at least two normal RCU grace
> > > >         periods.
> > > > o       RCU Tasks Trace priority boosting could be easily added.
> > > >         Boosting SRCU readers is more difficult.
> > > >
> > > > SRCU-fast also has a drawback compared to SRCU-lite, namely that the
> > > > return value from srcu_read_lock_fast() is a 64-bit pointer and
> > > > that from srcu_read_lock_lite() is only a 32-bit int.
> > > >
> > > > [ paulmck: Apply feedback from Akira Yokosawa. ]
> > > >
> > > > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > > > Cc: Alexei Starovoitov <ast@kernel.org>
> > > > Cc: Andrii Nakryiko <andrii@kernel.org>
> > > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > > Cc: Kent Overstreet <kent.overstreet@linux.dev>
> > > > Cc: <bpf@vger.kernel.org>
> > > > ---
> > > >  include/linux/srcu.h     | 47 ++++++++++++++++++++++++++++++++++++++--
> > > >  include/linux/srcutiny.h | 22 +++++++++++++++++++
> > > >  include/linux/srcutree.h | 38 ++++++++++++++++++++++++++++++++
> > > >  3 files changed, 105 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> > > > index 2bd0e24e9b554..63bddc3014238 100644
> > > > --- a/include/linux/srcu.h
> > > > +++ b/include/linux/srcu.h
> > > > @@ -47,9 +47,10 @@ int init_srcu_struct(struct srcu_struct *ssp);
> > > >  #define SRCU_READ_FLAVOR_NORMAL        0x1             // srcu_read_lock().
> > > >  #define SRCU_READ_FLAVOR_NMI   0x2             // srcu_read_lock_nmisafe().
> > > >  #define SRCU_READ_FLAVOR_LITE  0x4             // srcu_read_lock_lite().
> > > > +#define SRCU_READ_FLAVOR_FAST  0x8             // srcu_read_lock_fast().
> > > >  #define SRCU_READ_FLAVOR_ALL   (SRCU_READ_FLAVOR_NORMAL | SRCU_READ_FLAVOR_NMI | \
> > > > -                               SRCU_READ_FLAVOR_LITE) // All of the above.
> > > > -#define SRCU_READ_FLAVOR_SLOWGP        SRCU_READ_FLAVOR_LITE
> > > > +                               SRCU_READ_FLAVOR_LITE | SRCU_READ_FLAVOR_FAST) // All of the above.
> > > > +#define SRCU_READ_FLAVOR_SLOWGP        (SRCU_READ_FLAVOR_LITE | SRCU_READ_FLAVOR_FAST)
> > > >                                                 // Flavors requiring synchronize_rcu()
> > > >                                                 // instead of smp_mb().
> > > >  void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp);
> > > > @@ -253,6 +254,33 @@ static inline int srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp)
> > > >         return retval;
> > > >  }
> > > >
> > > > +/**
> > > > + * srcu_read_lock_fast - register a new reader for an SRCU-protected structure.
> > > > + * @ssp: srcu_struct in which to register the new reader.
> > > > + *
> > > > + * Enter an SRCU read-side critical section, but for a light-weight
> > > > + * smp_mb()-free reader.  See srcu_read_lock() for more information.
> > > > + *
> > > > + * If srcu_read_lock_fast() is ever used on an srcu_struct structure,
> > > > + * then none of the other flavors may be used, whether before, during,
> > > > + * or after.  Note that grace-period auto-expediting is disabled for _fast
> > > > + * srcu_struct structures because auto-expedited grace periods invoke
> > > > + * synchronize_rcu_expedited(), IPIs and all.
> > > > + *
> > > > + * Note that srcu_read_lock_fast() can be invoked only from those contexts
> > > > + * where RCU is watching, that is, from contexts where it would be legal
> > > > + * to invoke rcu_read_lock().  Otherwise, lockdep will complain.
> > > > + */
> > > > +static inline struct srcu_ctr __percpu *srcu_read_lock_fast(struct srcu_struct *ssp) __acquires(ssp)
> > > > +{
> > > > +       struct srcu_ctr __percpu *retval;
> > > > +
> > > > +       srcu_check_read_flavor_force(ssp, SRCU_READ_FLAVOR_FAST);
> > > > +       retval = __srcu_read_lock_fast(ssp);
> > > > +       rcu_try_lock_acquire(&ssp->dep_map);
> > > > +       return retval;
> > > > +}
> > > > +
> > > >  /**
> > > >   * srcu_read_lock_lite - register a new reader for an SRCU-protected structure.
> > > >   * @ssp: srcu_struct in which to register the new reader.
> > > > @@ -356,6 +384,21 @@ static inline void srcu_read_unlock(struct srcu_struct *ssp, int idx)
> > > >         __srcu_read_unlock(ssp, idx);
> > > >  }
> > > >
> > > > +/**
> > > > + * srcu_read_unlock_fast - unregister a old reader from an SRCU-protected structure.
> > > > + * @ssp: srcu_struct in which to unregister the old reader.
> > > > + * @scp: return value from corresponding srcu_read_lock_fast().
> > > > + *
> > > > + * Exit a light-weight SRCU read-side critical section.
> > > > + */
> > > > +static inline void srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
> > > > +       __releases(ssp)
> > > > +{
> > > > +       srcu_check_read_flavor(ssp, SRCU_READ_FLAVOR_FAST);
> > > > +       srcu_lock_release(&ssp->dep_map);
> > > > +       __srcu_read_unlock_fast(ssp, scp);
> > > > +}
> > > > +
> > > >  /**
> > > >   * srcu_read_unlock_lite - unregister a old reader from an SRCU-protected structure.
> > > >   * @ssp: srcu_struct in which to unregister the old reader.
> > > > diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
> > > > index 07a0c4489ea2f..380260317d98b 100644
> > > > --- a/include/linux/srcutiny.h
> > > > +++ b/include/linux/srcutiny.h
> > > > @@ -71,6 +71,28 @@ static inline int __srcu_read_lock(struct srcu_struct *ssp)
> > > >         return idx;
> > > >  }
> > > >
> > > > +struct srcu_ctr;
> > > > +
> > > > +static inline bool __srcu_ptr_to_ctr(struct srcu_struct *ssp, struct srcu_ctr __percpu *scpp)
> > > > +{
> > > > +       return (int)(intptr_t)(struct srcu_ctr __force __kernel *)scpp;
> > > > +}
> > > > +
> > > > +static inline struct srcu_ctr __percpu *__srcu_ctr_to_ptr(struct srcu_struct *ssp, int idx)
> > > > +{
> > > > +       return (struct srcu_ctr __percpu *)(intptr_t)idx;
> > > > +}
> > > > +
> > > > +static inline struct srcu_ctr __percpu *__srcu_read_lock_fast(struct srcu_struct *ssp)
> > > > +{
> > > > +       return __srcu_ctr_to_ptr(ssp, __srcu_read_lock(ssp));
> > > > +}
> > > > +
> > > > +static inline void __srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
> > > > +{
> > > > +       __srcu_read_unlock(ssp, __srcu_ptr_to_ctr(ssp, scp));
> > > > +}
> > > > +
> > > >  #define __srcu_read_lock_lite __srcu_read_lock
> > > >  #define __srcu_read_unlock_lite __srcu_read_unlock
> > > >
> > > > diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
> > > > index ef3065c0cadcd..bdc467efce3a2 100644
> > > > --- a/include/linux/srcutree.h
> > > > +++ b/include/linux/srcutree.h
> > > > @@ -226,6 +226,44 @@ static inline struct srcu_ctr __percpu *__srcu_ctr_to_ptr(struct srcu_struct *ss
> > > >         return &ssp->sda->srcu_ctrs[idx];
> > > >  }
> > > >
> > > > +/*
> > > > + * Counts the new reader in the appropriate per-CPU element of the
> > > > + * srcu_struct.  Returns a pointer that must be passed to the matching
> > > > + * srcu_read_unlock_fast().
> > > > + *
> > > > + * Note that this_cpu_inc() is an RCU read-side critical section either
> > > > + * because it disables interrupts, because it is a single instruction,
> > > > + * or because it is a read-modify-write atomic operation, depending on
> > > > + * the whims of the architecture.
> > > > + */
> > > > +static inline struct srcu_ctr __percpu *__srcu_read_lock_fast(struct srcu_struct *ssp)
> > > > +{
> > > > +       struct srcu_ctr __percpu *scp = READ_ONCE(ssp->srcu_ctrp);
> > > > +
> > > > +       RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_lock_fast().");
> > > > +       this_cpu_inc(scp->srcu_locks.counter); /* Y */
> > > > +       barrier(); /* Avoid leaking the critical section. */
> > > > +       return scp;
> > > > +}
> > > > +
> > > > +/*
> > > > + * Removes the count for the old reader from the appropriate
> > > > + * per-CPU element of the srcu_struct.  Note that this may well be a
> > > > + * different CPU than that which was incremented by the corresponding
> > > > + * srcu_read_lock_fast(), but it must be within the same task.
> > >
> > > hm... why the "same task" restriction? With uretprobes we take
> > > srcu_read_lock under a traced task, but we can "release" this lock
> > > from timer interrupt, which could be in the context of any task.
> >
> > A kneejerk reaction on my part?  ;-)
> >
> > But the good news is that this restriction is easy for me to relax.
> > I adjust the comments and remove the rcu_try_lock_acquire()
> > from srcu_read_lock_fast() and the srcu_lock_release() from
> > srcu_read_unlock_fast().
> >
> > But in that case, I should rename this to srcu_down_read_fast() and
> > srcu_up_read_fast() or similar, as these names would clearly indicate
> > that they can be used cross-task.  (And from interrupt handlers, but
> > not from NMI handlers.)
> >
> > Also, srcu_read_lock() and srcu_read_unlock() have srcu_lock_acquire() and
> > srcu_lock_release() calls, so I am surprised that lockdep didn't complain
> > when you invoked srcu_read_lock() from task context and srcu_read_unlock()
> > from a timer interrupt.
> 
> That's because I'm using __srcu_read_lock and __srcu_read_unlock ;)
> See this part in kernel/events/uprobes.c:
> 
> /* __srcu_read_lock() because SRCU lock survives switch to user space */
> srcu_idx = __srcu_read_lock(&uretprobes_srcu);

I clearly should have read that code more carefully!  ;-)

This happens to work for srcu_read_lock() and srcu_read_lock_nmisafe(),
but invoking __srcu_read_lock_lite() or __srcu_read_lock_fast() would fail
(rarely!) with too-short grace periods.

So thank you for catching this!  I will adjust the name and semantics.

                                                        Thanx, Paul

> > > > + *
> > > > + * Note that this_cpu_inc() is an RCU read-side critical section either
> > > > + * because it disables interrupts, because it is a single instruction,
> > > > + * or because it is a read-modify-write atomic operation, depending on
> > > > + * the whims of the architecture.
> > > > + */
> > > > +static inline void __srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
> > > > +{
> > > > +       barrier();  /* Avoid leaking the critical section. */
> > > > +       this_cpu_inc(scp->srcu_unlocks.counter);  /* Z */
> > > > +       RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_unlock_fast().");
> > > > +}
> > > > +
> > > >  /*
> > > >   * Counts the new reader in the appropriate per-CPU element of the
> > > >   * srcu_struct.  Returns an index that must be passed to the matching
> > > > --
> > > > 2.40.1
> > > >

Patch

diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index 2bd0e24e9b554..63bddc3014238 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -47,9 +47,10 @@  int init_srcu_struct(struct srcu_struct *ssp);
 #define SRCU_READ_FLAVOR_NORMAL	0x1		// srcu_read_lock().
 #define SRCU_READ_FLAVOR_NMI	0x2		// srcu_read_lock_nmisafe().
 #define SRCU_READ_FLAVOR_LITE	0x4		// srcu_read_lock_lite().
+#define SRCU_READ_FLAVOR_FAST	0x8		// srcu_read_lock_fast().
 #define SRCU_READ_FLAVOR_ALL   (SRCU_READ_FLAVOR_NORMAL | SRCU_READ_FLAVOR_NMI | \
-				SRCU_READ_FLAVOR_LITE) // All of the above.
-#define SRCU_READ_FLAVOR_SLOWGP	SRCU_READ_FLAVOR_LITE
+				SRCU_READ_FLAVOR_LITE | SRCU_READ_FLAVOR_FAST) // All of the above.
+#define SRCU_READ_FLAVOR_SLOWGP	(SRCU_READ_FLAVOR_LITE | SRCU_READ_FLAVOR_FAST)
 						// Flavors requiring synchronize_rcu()
 						// instead of smp_mb().
 void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp);
@@ -253,6 +254,33 @@  static inline int srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp)
 	return retval;
 }
 
+/**
+ * srcu_read_lock_fast - register a new reader for an SRCU-protected structure.
+ * @ssp: srcu_struct in which to register the new reader.
+ *
+ * Enter an SRCU read-side critical section, but for a light-weight
+ * smp_mb()-free reader.  See srcu_read_lock() for more information.
+ *
+ * If srcu_read_lock_fast() is ever used on an srcu_struct structure,
+ * then none of the other flavors may be used, whether before, during,
+ * or after.  Note that grace-period auto-expediting is disabled for _fast
+ * srcu_struct structures because auto-expedited grace periods invoke
+ * synchronize_rcu_expedited(), IPIs and all.
+ *
+ * Note that srcu_read_lock_fast() can be invoked only from those contexts
+ * where RCU is watching, that is, from contexts where it would be legal
+ * to invoke rcu_read_lock().  Otherwise, lockdep will complain.
+ */
+static inline struct srcu_ctr __percpu *srcu_read_lock_fast(struct srcu_struct *ssp) __acquires(ssp)
+{
+	struct srcu_ctr __percpu *retval;
+
+	srcu_check_read_flavor_force(ssp, SRCU_READ_FLAVOR_FAST);
+	retval = __srcu_read_lock_fast(ssp);
+	rcu_try_lock_acquire(&ssp->dep_map);
+	return retval;
+}
+
 /**
  * srcu_read_lock_lite - register a new reader for an SRCU-protected structure.
  * @ssp: srcu_struct in which to register the new reader.
@@ -356,6 +384,21 @@  static inline void srcu_read_unlock(struct srcu_struct *ssp, int idx)
 	__srcu_read_unlock(ssp, idx);
 }
 
+/**
+ * srcu_read_unlock_fast - unregister a old reader from an SRCU-protected structure.
+ * @ssp: srcu_struct in which to unregister the old reader.
+ * @scp: return value from corresponding srcu_read_lock_fast().
+ *
+ * Exit a light-weight SRCU read-side critical section.
+ */
+static inline void srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
+	__releases(ssp)
+{
+	srcu_check_read_flavor(ssp, SRCU_READ_FLAVOR_FAST);
+	srcu_lock_release(&ssp->dep_map);
+	__srcu_read_unlock_fast(ssp, scp);
+}
+
 /**
  * srcu_read_unlock_lite - unregister a old reader from an SRCU-protected structure.
  * @ssp: srcu_struct in which to unregister the old reader.
diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
index 07a0c4489ea2f..380260317d98b 100644
--- a/include/linux/srcutiny.h
+++ b/include/linux/srcutiny.h
@@ -71,6 +71,28 @@  static inline int __srcu_read_lock(struct srcu_struct *ssp)
 	return idx;
 }
 
+struct srcu_ctr;
+
+static inline bool __srcu_ptr_to_ctr(struct srcu_struct *ssp, struct srcu_ctr __percpu *scpp)
+{
+	return (int)(intptr_t)(struct srcu_ctr __force __kernel *)scpp;
+}
+
+static inline struct srcu_ctr __percpu *__srcu_ctr_to_ptr(struct srcu_struct *ssp, int idx)
+{
+	return (struct srcu_ctr __percpu *)(intptr_t)idx;
+}
+
+static inline struct srcu_ctr __percpu *__srcu_read_lock_fast(struct srcu_struct *ssp)
+{
+	return __srcu_ctr_to_ptr(ssp, __srcu_read_lock(ssp));
+}
+
+static inline void __srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
+{
+	__srcu_read_unlock(ssp, __srcu_ptr_to_ctr(ssp, scp));
+}
+
 #define __srcu_read_lock_lite __srcu_read_lock
 #define __srcu_read_unlock_lite __srcu_read_unlock
 
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index ef3065c0cadcd..bdc467efce3a2 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -226,6 +226,44 @@  static inline struct srcu_ctr __percpu *__srcu_ctr_to_ptr(struct srcu_struct *ss
 	return &ssp->sda->srcu_ctrs[idx];
 }
 
+/*
+ * Counts the new reader in the appropriate per-CPU element of the
+ * srcu_struct.  Returns a pointer that must be passed to the matching
+ * srcu_read_unlock_fast().
+ *
+ * Note that this_cpu_inc() is an RCU read-side critical section either
+ * because it disables interrupts, because it is a single instruction,
+ * or because it is a read-modify-write atomic operation, depending on
+ * the whims of the architecture.
+ */
+static inline struct srcu_ctr __percpu *__srcu_read_lock_fast(struct srcu_struct *ssp)
+{
+	struct srcu_ctr __percpu *scp = READ_ONCE(ssp->srcu_ctrp);
+
+	RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_lock_fast().");
+	this_cpu_inc(scp->srcu_locks.counter); /* Y */
+	barrier(); /* Avoid leaking the critical section. */
+	return scp;
+}
+
+/*
+ * Removes the count for the old reader from the appropriate
+ * per-CPU element of the srcu_struct.  Note that this may well be a
+ * different CPU than that which was incremented by the corresponding
+ * srcu_read_lock_fast(), but it must be within the same task.
+ *
+ * Note that this_cpu_inc() is an RCU read-side critical section either
+ * because it disables interrupts, because it is a single instruction,
+ * or because it is a read-modify-write atomic operation, depending on
+ * the whims of the architecture.
+ */
+static inline void __srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
+{
+	barrier();  /* Avoid leaking the critical section. */
+	this_cpu_inc(scp->srcu_unlocks.counter);  /* Z */
+	RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_unlock_fast().");
+}
+
 /*
  * Counts the new reader in the appropriate per-CPU element of the
  * srcu_struct.  Returns an index that must be passed to the matching