Message ID | 20220929180731.2875722-1-paulmck@kernel.org
---|---
State | New, archived
Series | NMI-safe SRCU reader API
Hi Paul,

On 2022-09-29, "Paul E. McKenney" <paulmck@kernel.org> wrote:
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 1c304fec89c0..6fd0665f4d1f 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -636,7 +636,7 @@ int __srcu_read_lock(struct srcu_struct *ssp)
>  	int idx;
>
>  	idx = READ_ONCE(ssp->srcu_idx) & 0x1;
> -	this_cpu_inc(ssp->sda->srcu_lock_count[idx]);
> +	this_cpu_inc(ssp->sda->srcu_lock_count[idx].counter);
>  	smp_mb(); /* B */  /* Avoid leaking the critical section. */
>  	return idx;
>  }

Is there any particular reason that you are directly modifying @counter
instead of using raw_cpu_ptr()+atomic_long_inc() as you do in
__srcu_read_lock_nmisafe() of patch 2?

> @@ -650,7 +650,7 @@ EXPORT_SYMBOL_GPL(__srcu_read_lock);
>  void __srcu_read_unlock(struct srcu_struct *ssp, int idx)
>  {
>  	smp_mb(); /* C */  /* Avoid leaking the critical section. */
> -	this_cpu_inc(ssp->sda->srcu_unlock_count[idx]);
> +	this_cpu_inc(ssp->sda->srcu_unlock_count[idx].counter);
>  }
>  EXPORT_SYMBOL_GPL(__srcu_read_unlock);

Ditto.

> @@ -1687,8 +1687,8 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
>  			struct srcu_data *sdp;
>
>  			sdp = per_cpu_ptr(ssp->sda, cpu);
> -			u0 = data_race(sdp->srcu_unlock_count[!idx]);
> -			u1 = data_race(sdp->srcu_unlock_count[idx]);
> +			u0 = data_race(sdp->srcu_unlock_count[!idx].counter);
> +			u1 = data_race(sdp->srcu_unlock_count[idx].counter);
>
>  			/*
>  			 * Make sure that a lock is always counted if the corresponding

And instead of atomic_long_read().

> @@ -1696,8 +1696,8 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
>  			 */
>  			smp_rmb();
>
> -			l0 = data_race(sdp->srcu_lock_count[!idx]);
> -			l1 = data_race(sdp->srcu_lock_count[idx]);
> +			l0 = data_race(sdp->srcu_lock_count[!idx].counter);
> +			l1 = data_race(sdp->srcu_lock_count[idx].counter);
>
>  			c0 = l0 - u0;
>  			c1 = l1 - u1;

Ditto.

John Ogness
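[For context, the raw_cpu_ptr()+atomic_long_inc() shape that John is
contrasting against -- the NMI-safe reader from patch 2 of the series --
looks roughly like the sketch below. It is reconstructed from the
description in this thread rather than quoted from patch 2 itself, so the
exact barrier and details are assumptions.]

/* Sketch of the patch-2 NMI-safe reader John refers to; reconstructed
 * from this discussion, so details may differ from the posted patch. */
int __srcu_read_lock_nmisafe(struct srcu_struct *ssp)
{
	struct srcu_data *sdp = raw_cpu_ptr(ssp->sda);
	int idx;

	idx = READ_ONCE(ssp->srcu_idx) & 0x1;
	atomic_long_inc(&sdp->srcu_lock_count[idx]);	/* NMI-safe RMW. */
	smp_mb__after_atomic(); /* B */  /* Avoid leaking the critical section. */
	return idx;
}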
On Fri, Sep 30, 2022 at 05:08:18PM +0206, John Ogness wrote:
> Hi Paul,
>
> On 2022-09-29, "Paul E. McKenney" <paulmck@kernel.org> wrote:
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index 1c304fec89c0..6fd0665f4d1f 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -636,7 +636,7 @@ int __srcu_read_lock(struct srcu_struct *ssp)
> >  	int idx;
> >
> >  	idx = READ_ONCE(ssp->srcu_idx) & 0x1;
> > -	this_cpu_inc(ssp->sda->srcu_lock_count[idx]);
> > +	this_cpu_inc(ssp->sda->srcu_lock_count[idx].counter);
> >  	smp_mb(); /* B */  /* Avoid leaking the critical section. */
> >  	return idx;
> >  }
>
> Is there any particular reason that you are directly modifying @counter
> instead of using raw_cpu_ptr()+atomic_long_inc() as you do in
> __srcu_read_lock_nmisafe() of patch 2?

Performance.  From what I can see, this_cpu_inc() is way faster than
atomic_long_inc() on x86 and s390.  Maybe also on loongarch.  No idea
on arm64.

> > @@ -650,7 +650,7 @@ EXPORT_SYMBOL_GPL(__srcu_read_lock);
> >  void __srcu_read_unlock(struct srcu_struct *ssp, int idx)
> >  {
> >  	smp_mb(); /* C */  /* Avoid leaking the critical section. */
> > -	this_cpu_inc(ssp->sda->srcu_unlock_count[idx]);
> > +	this_cpu_inc(ssp->sda->srcu_unlock_count[idx].counter);
> >  }
> >  EXPORT_SYMBOL_GPL(__srcu_read_unlock);
>
> Ditto.

Ditto back at you!  ;-)

> > @@ -1687,8 +1687,8 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
> >  			struct srcu_data *sdp;
> >
> >  			sdp = per_cpu_ptr(ssp->sda, cpu);
> > -			u0 = data_race(sdp->srcu_unlock_count[!idx]);
> > -			u1 = data_race(sdp->srcu_unlock_count[idx]);
> > +			u0 = data_race(sdp->srcu_unlock_count[!idx].counter);
> > +			u1 = data_race(sdp->srcu_unlock_count[idx].counter);
> >
> >  			/*
> >  			 * Make sure that a lock is always counted if the corresponding
>
> And instead of atomic_long_read().

You are right, here I could just as well use atomic_long_read().

> > @@ -1696,8 +1696,8 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
> >  			 */
> >  			smp_rmb();
> >
> > -			l0 = data_race(sdp->srcu_lock_count[!idx]);
> > -			l1 = data_race(sdp->srcu_lock_count[idx]);
> > +			l0 = data_race(sdp->srcu_lock_count[!idx].counter);
> > +			l1 = data_race(sdp->srcu_lock_count[idx].counter);
> >
> >  			c0 = l0 - u0;
> >  			c1 = l1 - u1;
>
> Ditto.

And here as well.  ;-)

I will fix these, and thank you for looking this over!

							Thanx, Paul
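[To make the performance point above concrete, here is a minimal sketch
of the two increment styles. The demo_count variable is hypothetical,
and the x86-64 instructions in the comments are an assumption for
illustration; other architectures and compilers will differ.]

#include <linux/percpu.h>
#include <linux/atomic.h>

static DEFINE_PER_CPU(atomic_long_t, demo_count);	/* hypothetical */

static void demo_fast_path(void)
{
	/* Non-LOCK'd per-CPU RMW; on x86-64 roughly a single
	 * "incq %gs:demo_count" instruction, hence cheap. */
	this_cpu_inc(demo_count.counter);
}

static void demo_nmisafe_path(void)
{
	/* Fully atomic RMW on this CPU's slot; on x86-64 a
	 * LOCK-prefixed instruction, NMI-safe on all architectures
	 * but noticeably slower -- hence the answer above. */
	atomic_long_inc(raw_cpu_ptr(&demo_count));
}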
On 2022-09-30, "Paul E. McKenney" <paulmck@kernel.org> wrote:
>> > -	this_cpu_inc(ssp->sda->srcu_lock_count[idx]);
>> > +	this_cpu_inc(ssp->sda->srcu_lock_count[idx].counter);
>>
>> Is there any particular reason that you are directly modifying
>> @counter instead of using raw_cpu_ptr()+atomic_long_inc() as you do
>> in __srcu_read_lock_nmisafe() of patch 2?
>
> Performance.  From what I can see, this_cpu_inc() is way faster than
> atomic_long_inc() on x86 and s390.  Maybe also on loongarch.  No idea
> on arm64.

Yeah, that's what I figured. I just wanted to make sure.

FWIW, the rest of the series looks pretty straightforward to me.

John
On Fri, Sep 30, 2022 at 10:43:43PM +0206, John Ogness wrote:
> On 2022-09-30, "Paul E. McKenney" <paulmck@kernel.org> wrote:
> >> > -	this_cpu_inc(ssp->sda->srcu_lock_count[idx]);
> >> > +	this_cpu_inc(ssp->sda->srcu_lock_count[idx].counter);
> >>
> >> Is there any particular reason that you are directly modifying
> >> @counter instead of using raw_cpu_ptr()+atomic_long_inc() as you do
> >> in __srcu_read_lock_nmisafe() of patch 2?
> >
> > Performance.  From what I can see, this_cpu_inc() is way faster than
> > atomic_long_inc() on x86 and s390.  Maybe also on loongarch.  No idea
> > on arm64.
>
> Yeah, that's what I figured. I just wanted to make sure.
>
> FWIW, the rest of the series looks pretty straightforward to me.

Thank you for looking it over!

The updated patch is shown below.  The full series is in -rcu at branch
srcunmisafe.2022.09.30a.  Feel free to pull it into your printk() series
if that makes things easier for you.

							Thanx, Paul

------------------------------------------------------------------------

commit 24511d0b754db760d4e1a08fc48a180f6a5a948b
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Thu Sep 15 12:09:30 2022 -0700

    srcu: Convert ->srcu_lock_count and ->srcu_unlock_count to atomic

    NMI-safe variants of srcu_read_lock() and srcu_read_unlock() are needed
    by printk(), which on many architectures entails read-modify-write
    atomic operations.  This commit prepares Tree SRCU for this change by
    making both ->srcu_lock_count and ->srcu_unlock_count be atomic_long_t.

    [ paulmck: Apply feedback from John Ogness. ]

    Link: https://lore.kernel.org/all/20220910221947.171557773@linutronix.de/

    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: John Ogness <john.ogness@linutronix.de>
    Cc: Petr Mladek <pmladek@suse.com>

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index e3014319d1ad..0c4eca07d78d 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -23,8 +23,8 @@ struct srcu_struct;
  */
 struct srcu_data {
 	/* Read-side state. */
-	unsigned long srcu_lock_count[2];	/* Locks per CPU. */
-	unsigned long srcu_unlock_count[2];	/* Unlocks per CPU. */
+	atomic_long_t srcu_lock_count[2];	/* Locks per CPU. */
+	atomic_long_t srcu_unlock_count[2];	/* Unlocks per CPU. */

 	/* Update-side state. */
 	spinlock_t __private lock ____cacheline_internodealigned_in_smp;
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 1c304fec89c0..25e9458da6a2 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -417,7 +417,7 @@ static unsigned long srcu_readers_lock_idx(struct srcu_struct *ssp, int idx)
 	for_each_possible_cpu(cpu) {
 		struct srcu_data *cpuc = per_cpu_ptr(ssp->sda, cpu);

-		sum += READ_ONCE(cpuc->srcu_lock_count[idx]);
+		sum += atomic_long_read(&cpuc->srcu_lock_count[idx]);
 	}
 	return sum;
 }
@@ -434,7 +434,7 @@ static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx)
 	for_each_possible_cpu(cpu) {
 		struct srcu_data *cpuc = per_cpu_ptr(ssp->sda, cpu);

-		sum += READ_ONCE(cpuc->srcu_unlock_count[idx]);
+		sum += atomic_long_read(&cpuc->srcu_unlock_count[idx]);
 	}
 	return sum;
 }
@@ -503,10 +503,10 @@ static bool srcu_readers_active(struct srcu_struct *ssp)
 	for_each_possible_cpu(cpu) {
 		struct srcu_data *cpuc = per_cpu_ptr(ssp->sda, cpu);

-		sum += READ_ONCE(cpuc->srcu_lock_count[0]);
-		sum += READ_ONCE(cpuc->srcu_lock_count[1]);
-		sum -= READ_ONCE(cpuc->srcu_unlock_count[0]);
-		sum -= READ_ONCE(cpuc->srcu_unlock_count[1]);
+		sum += atomic_long_read(&cpuc->srcu_lock_count[0]);
+		sum += atomic_long_read(&cpuc->srcu_lock_count[1]);
+		sum -= atomic_long_read(&cpuc->srcu_unlock_count[0]);
+		sum -= atomic_long_read(&cpuc->srcu_unlock_count[1]);
 	}
 	return sum;
 }
@@ -636,7 +636,7 @@ int __srcu_read_lock(struct srcu_struct *ssp)
 	int idx;

 	idx = READ_ONCE(ssp->srcu_idx) & 0x1;
-	this_cpu_inc(ssp->sda->srcu_lock_count[idx]);
+	this_cpu_inc(ssp->sda->srcu_lock_count[idx].counter);
 	smp_mb(); /* B */  /* Avoid leaking the critical section. */
 	return idx;
 }
@@ -650,7 +650,7 @@ EXPORT_SYMBOL_GPL(__srcu_read_lock);
 void __srcu_read_unlock(struct srcu_struct *ssp, int idx)
 {
 	smp_mb(); /* C */  /* Avoid leaking the critical section. */
-	this_cpu_inc(ssp->sda->srcu_unlock_count[idx]);
+	this_cpu_inc(ssp->sda->srcu_unlock_count[idx].counter);
 }
 EXPORT_SYMBOL_GPL(__srcu_read_unlock);
@@ -1687,8 +1687,8 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 			struct srcu_data *sdp;

 			sdp = per_cpu_ptr(ssp->sda, cpu);
-			u0 = data_race(sdp->srcu_unlock_count[!idx]);
-			u1 = data_race(sdp->srcu_unlock_count[idx]);
+			u0 = data_race(atomic_long_read(&sdp->srcu_unlock_count[!idx]));
+			u1 = data_race(atomic_long_read(&sdp->srcu_unlock_count[idx]));

 			/*
 			 * Make sure that a lock is always counted if the corresponding
@@ -1696,8 +1696,8 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 			 */
 			smp_rmb();

-			l0 = data_race(sdp->srcu_lock_count[!idx]);
-			l1 = data_race(sdp->srcu_lock_count[idx]);
+			l0 = data_race(atomic_long_read(&sdp->srcu_lock_count[!idx]));
+			l1 = data_race(atomic_long_read(&sdp->srcu_lock_count[idx]));

 			c0 = l0 - u0;
 			c1 = l1 - u1;
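[As an aside on why this_cpu_inc() on ->srcu_lock_count[idx].counter
type-checks at all: atomic_long_t is a plain one-member struct, roughly
as sketched below. The rendering is simplified and the name is
hypothetical; the real definitions live in the kernel headers, and
32-bit kernels wrap an int instead of a long.]

/* Simplified, hypothetical rendering of atomic_long_t's layout on a
 * 64-bit kernel; shown only to illustrate why .counter can be handed
 * directly to the per-CPU ops. */
typedef struct {
	long counter;
} demo_atomic_long_t;

/* So the reader fast paths increment an ordinary long via the cheap
 * per-CPU ops, while the update side and the NMI-safe readers go
 * through the atomic_long_*() accessors on the very same storage. */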
NMI-safe variants of srcu_read_lock() and srcu_read_unlock() are needed
by printk(), which on many architectures entails read-modify-write
atomic operations.  This commit prepares Tree SRCU for this change by
making both ->srcu_lock_count and ->srcu_unlock_count be atomic_long_t.

Link: https://lore.kernel.org/all/20220910221947.171557773@linutronix.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Petr Mladek <pmladek@suse.com>
---
 include/linux/srcutree.h |  4 ++--
 kernel/rcu/srcutree.c    | 24 ++++++++++++------------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index e3014319d1ad..0c4eca07d78d 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -23,8 +23,8 @@ struct srcu_struct;
  */
 struct srcu_data {
 	/* Read-side state. */
-	unsigned long srcu_lock_count[2];	/* Locks per CPU. */
-	unsigned long srcu_unlock_count[2];	/* Unlocks per CPU. */
+	atomic_long_t srcu_lock_count[2];	/* Locks per CPU. */
+	atomic_long_t srcu_unlock_count[2];	/* Unlocks per CPU. */

 	/* Update-side state. */
 	spinlock_t __private lock ____cacheline_internodealigned_in_smp;
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 1c304fec89c0..6fd0665f4d1f 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -417,7 +417,7 @@ static unsigned long srcu_readers_lock_idx(struct srcu_struct *ssp, int idx)
 	for_each_possible_cpu(cpu) {
 		struct srcu_data *cpuc = per_cpu_ptr(ssp->sda, cpu);

-		sum += READ_ONCE(cpuc->srcu_lock_count[idx]);
+		sum += atomic_long_read(&cpuc->srcu_lock_count[idx]);
 	}
 	return sum;
 }
@@ -434,7 +434,7 @@ static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx)
 	for_each_possible_cpu(cpu) {
 		struct srcu_data *cpuc = per_cpu_ptr(ssp->sda, cpu);

-		sum += READ_ONCE(cpuc->srcu_unlock_count[idx]);
+		sum += atomic_long_read(&cpuc->srcu_unlock_count[idx]);
 	}
 	return sum;
 }
@@ -503,10 +503,10 @@ static bool srcu_readers_active(struct srcu_struct *ssp)
 	for_each_possible_cpu(cpu) {
 		struct srcu_data *cpuc = per_cpu_ptr(ssp->sda, cpu);

-		sum += READ_ONCE(cpuc->srcu_lock_count[0]);
-		sum += READ_ONCE(cpuc->srcu_lock_count[1]);
-		sum -= READ_ONCE(cpuc->srcu_unlock_count[0]);
-		sum -= READ_ONCE(cpuc->srcu_unlock_count[1]);
+		sum += atomic_long_read(&cpuc->srcu_lock_count[0]);
+		sum += atomic_long_read(&cpuc->srcu_lock_count[1]);
+		sum -= atomic_long_read(&cpuc->srcu_unlock_count[0]);
+		sum -= atomic_long_read(&cpuc->srcu_unlock_count[1]);
 	}
 	return sum;
 }
@@ -636,7 +636,7 @@ int __srcu_read_lock(struct srcu_struct *ssp)
 	int idx;

 	idx = READ_ONCE(ssp->srcu_idx) & 0x1;
-	this_cpu_inc(ssp->sda->srcu_lock_count[idx]);
+	this_cpu_inc(ssp->sda->srcu_lock_count[idx].counter);
 	smp_mb(); /* B */  /* Avoid leaking the critical section. */
 	return idx;
 }
@@ -650,7 +650,7 @@ EXPORT_SYMBOL_GPL(__srcu_read_lock);
 void __srcu_read_unlock(struct srcu_struct *ssp, int idx)
 {
 	smp_mb(); /* C */  /* Avoid leaking the critical section. */
-	this_cpu_inc(ssp->sda->srcu_unlock_count[idx]);
+	this_cpu_inc(ssp->sda->srcu_unlock_count[idx].counter);
 }
 EXPORT_SYMBOL_GPL(__srcu_read_unlock);
@@ -1687,8 +1687,8 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 			struct srcu_data *sdp;

 			sdp = per_cpu_ptr(ssp->sda, cpu);
-			u0 = data_race(sdp->srcu_unlock_count[!idx]);
-			u1 = data_race(sdp->srcu_unlock_count[idx]);
+			u0 = data_race(sdp->srcu_unlock_count[!idx].counter);
+			u1 = data_race(sdp->srcu_unlock_count[idx].counter);

 			/*
 			 * Make sure that a lock is always counted if the corresponding
@@ -1696,8 +1696,8 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 			 */
 			smp_rmb();

-			l0 = data_race(sdp->srcu_lock_count[!idx]);
-			l1 = data_race(sdp->srcu_lock_count[idx]);
+			l0 = data_race(sdp->srcu_lock_count[!idx].counter);
+			l1 = data_race(sdp->srcu_lock_count[idx].counter);

 			c0 = l0 - u0;
 			c1 = l1 - u1;
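[One last note on the torture-stats reads: this v1 reaches through the
wrapper's member directly, while the updated patch above goes through
the accessor. Both are diagnostic-only loads wrapped in data_race();
reading them as equivalent here is my interpretation, not a claim made
in the thread.]

/* v1: direct member access on the atomic wrapper. */
u0 = data_race(sdp->srcu_unlock_count[!idx].counter);

/* v2 (after John's review): the same load via the accessor, which
 * keeps the atomic_long_t abstraction intact outside the fast paths. */
u0 = data_race(atomic_long_read(&sdp->srcu_unlock_count[!idx]));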