Message ID | 20241112013143.1926484-2-paulmck@kernel.org (mailing list archive) |
---|---|
State | New |
Series | [1/3] srcu: Remove smp_mb() from srcu_read_unlock_lite() |
On 11/12/2024 7:01 AM, Paul E. McKenney wrote:
> If srcu_read_lock_lite() is used on a given srcu_struct structure, then
> the grace-period processing must to synchronize_rcu() instead of smp_mb()

s/to/do/

> between the scans of the ->srcu_unlock_count[] and ->srcu_lock_count[]
> counters. Currently, it does that by testing the SRCU_READ_FLAVOR_LITE
> bit of the ->srcu_reader_flavor mask, which works well. But only if
> the CPU running that srcu_struct structure's grace period has previously
> executed srcu_read_lock_lite(), which might not be the case, especially
> just after that srcu_struct structure has been created and initialized.
>
> This commit therefore updates the srcu_readers_unlock_idx() function
> to OR together the ->srcu_reader_flavor masks from all CPUs, and
> then make the srcu_readers_active_idx_check() function that test the
> SRCU_READ_FLAVOR_LITE bit in the resulting mask.
>
> Note that the srcu_readers_unlock_idx() function is already scanning all
> the CPUs to sum up the ->srcu_unlock_count[] fields and that this is on
> the grace-period slow path, hence no concerns about the small amount of
> extra work.
>
> Reported-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> Closes: https://lore.kernel.org/all/d07e8f4a-d5ff-4c8e-8e61-50db285c57e9@amd.com/
> Fixes: c0f08d6b5a61 ("srcu: Add srcu_read_lock_lite() and srcu_read_unlock_lite()")
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> ---
> kernel/rcu/srcutree.c | 11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 70979f294768c..5991381b44383 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -458,7 +458,7 @@ static bool srcu_readers_lock_idx(struct srcu_struct *ssp, int idx, bool gp, uns
>   * Returns approximate total of the readers' ->srcu_unlock_count[] values
>   * for the rank of per-CPU counters specified by idx.
>   */
> -static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx)
> +static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx, unsigned long *rdm)
>  {
>  	int cpu;
>  	unsigned long mask = 0;
> @@ -468,11 +468,11 @@ static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx)
>  		struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu);
>
>  		sum += atomic_long_read(&sdp->srcu_unlock_count[idx]);
> -		if (IS_ENABLED(CONFIG_PROVE_RCU))
> -			mask = mask | READ_ONCE(sdp->srcu_reader_flavor);
> +		mask = mask | READ_ONCE(sdp->srcu_reader_flavor);
>  	}
>  	WARN_ONCE(IS_ENABLED(CONFIG_PROVE_RCU) && (mask & (mask - 1)),
>  		  "Mixed reader flavors for srcu_struct at %ps.\n", ssp);
> +	*rdm = mask;
>  	return sum;
>  }
>
> @@ -482,10 +482,11 @@ static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx)
>   */
>  static bool srcu_readers_active_idx_check(struct srcu_struct *ssp, int idx)
>  {
> -	bool did_gp = !!(raw_cpu_read(ssp->sda->srcu_reader_flavor) & SRCU_READ_FLAVOR_LITE);
> +	unsigned long rdm;
>  	unsigned long unlocks;
>
> -	unlocks = srcu_readers_unlock_idx(ssp, idx);
> +	unlocks = srcu_readers_unlock_idx(ssp, idx, &rdm);
> +	bool did_gp = !!(rdm & SRCU_READ_FLAVOR_LITE);

Move "did_gp" declaration up?

Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>

-Neeraj

On Tue, Nov 12, 2024 at 08:58:08AM +0530, Neeraj Upadhyay wrote:
> On 11/12/2024 7:01 AM, Paul E. McKenney wrote:
> > If srcu_read_lock_lite() is used on a given srcu_struct structure, then
> > the grace-period processing must to synchronize_rcu() instead of smp_mb()
>
> s/to/do/

Good eyes, fixed!

> > between the scans of the ->srcu_unlock_count[] and ->srcu_lock_count[]
> > counters. Currently, it does that by testing the SRCU_READ_FLAVOR_LITE
> > bit of the ->srcu_reader_flavor mask, which works well. But only if
> > the CPU running that srcu_struct structure's grace period has previously
> > executed srcu_read_lock_lite(), which might not be the case, especially
> > just after that srcu_struct structure has been created and initialized.
> >
> > This commit therefore updates the srcu_readers_unlock_idx() function
> > to OR together the ->srcu_reader_flavor masks from all CPUs, and
> > then make the srcu_readers_active_idx_check() function that test the
> > SRCU_READ_FLAVOR_LITE bit in the resulting mask.
> >
> > Note that the srcu_readers_unlock_idx() function is already scanning all
> > the CPUs to sum up the ->srcu_unlock_count[] fields and that this is on
> > the grace-period slow path, hence no concerns about the small amount of
> > extra work.
> >
> > Reported-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> > Closes: https://lore.kernel.org/all/d07e8f4a-d5ff-4c8e-8e61-50db285c57e9@amd.com/
> > Fixes: c0f08d6b5a61 ("srcu: Add srcu_read_lock_lite() and srcu_read_unlock_lite()")
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > Cc: Frederic Weisbecker <frederic@kernel.org>
> > ---
> > kernel/rcu/srcutree.c | 11 ++++++-----
> > 1 file changed, 6 insertions(+), 5 deletions(-)
> >
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index 70979f294768c..5991381b44383 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -458,7 +458,7 @@ static bool srcu_readers_lock_idx(struct srcu_struct *ssp, int idx, bool gp, uns
> >   * Returns approximate total of the readers' ->srcu_unlock_count[] values
> >   * for the rank of per-CPU counters specified by idx.
> >   */
> > -static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx)
> > +static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx, unsigned long *rdm)
> >  {
> >  	int cpu;
> >  	unsigned long mask = 0;
> > @@ -468,11 +468,11 @@ static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx)
> >  		struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu);
> >
> >  		sum += atomic_long_read(&sdp->srcu_unlock_count[idx]);
> > -		if (IS_ENABLED(CONFIG_PROVE_RCU))
> > -			mask = mask | READ_ONCE(sdp->srcu_reader_flavor);
> > +		mask = mask | READ_ONCE(sdp->srcu_reader_flavor);
> >  	}
> >  	WARN_ONCE(IS_ENABLED(CONFIG_PROVE_RCU) && (mask & (mask - 1)),
> >  		  "Mixed reader flavors for srcu_struct at %ps.\n", ssp);
> > +	*rdm = mask;
> >  	return sum;
> >  }
> >
> > @@ -482,10 +482,11 @@ static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx)
> >   */
> >  static bool srcu_readers_active_idx_check(struct srcu_struct *ssp, int idx)
> >  {
> > -	bool did_gp = !!(raw_cpu_read(ssp->sda->srcu_reader_flavor) & SRCU_READ_FLAVOR_LITE);
> > +	unsigned long rdm;
> >  	unsigned long unlocks;
> >
> > -	unlocks = srcu_readers_unlock_idx(ssp, idx);
> > +	unlocks = srcu_readers_unlock_idx(ssp, idx, &rdm);
> > +	bool did_gp = !!(rdm & SRCU_READ_FLAVOR_LITE);
>
> Move "did_gp" declaration up?

C now allows this? ;-)

Fixed!

> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>

And applied all three, again, thank you!

Thanx, Paul

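For readers following the "did_gp" exchange: mixing declarations and statements has been valid C since C99, but kernel code conventionally declares locals at the top of a block. The following is only a userspace sketch of what the hoisted form might look like, not the code that was applied. `sketch_unlock_idx()`, `sketch_check()`, and `SKETCH_FLAVOR_LITE` are hypothetical stand-ins for srcu_readers_unlock_idx(), srcu_readers_active_idx_check(), and SRCU_READ_FLAVOR_LITE.

```c
#include <stdbool.h>

/* Hypothetical stand-in for SRCU_READ_FLAVOR_LITE; the value is illustrative. */
#define SKETCH_FLAVOR_LITE 0x4UL

/*
 * Hypothetical stub for srcu_readers_unlock_idx(): returns a count and
 * reports the OR of the reader-flavor masks through *rdm.
 */
static unsigned long sketch_unlock_idx(unsigned long *rdm)
{
	*rdm = SKETCH_FLAVOR_LITE;
	return 0;
}

static bool sketch_check(void)
{
	bool did_gp;		/* declared with the other locals */
	unsigned long rdm;
	unsigned long unlocks;

	unlocks = sketch_unlock_idx(&rdm);
	did_gp = !!(rdm & SKETCH_FLAVOR_LITE);	/* assigned once rdm is valid */

	(void)unlocks;		/* the real function goes on to use this value */
	return did_gp;
}
```

The declaration moves up, but the assignment has to stay after the call, because rdm is not valid until the scan has filled it in.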
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 70979f294768c..5991381b44383 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -458,7 +458,7 @@ static bool srcu_readers_lock_idx(struct srcu_struct *ssp, int idx, bool gp, uns
  * Returns approximate total of the readers' ->srcu_unlock_count[] values
  * for the rank of per-CPU counters specified by idx.
  */
-static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx)
+static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx, unsigned long *rdm)
 {
 	int cpu;
 	unsigned long mask = 0;
@@ -468,11 +468,11 @@ static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx)
 		struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu);

 		sum += atomic_long_read(&sdp->srcu_unlock_count[idx]);
-		if (IS_ENABLED(CONFIG_PROVE_RCU))
-			mask = mask | READ_ONCE(sdp->srcu_reader_flavor);
+		mask = mask | READ_ONCE(sdp->srcu_reader_flavor);
 	}
 	WARN_ONCE(IS_ENABLED(CONFIG_PROVE_RCU) && (mask & (mask - 1)),
 		  "Mixed reader flavors for srcu_struct at %ps.\n", ssp);
+	*rdm = mask;
 	return sum;
 }

@@ -482,10 +482,11 @@ static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx)
  */
 static bool srcu_readers_active_idx_check(struct srcu_struct *ssp, int idx)
 {
-	bool did_gp = !!(raw_cpu_read(ssp->sda->srcu_reader_flavor) & SRCU_READ_FLAVOR_LITE);
+	unsigned long rdm;
 	unsigned long unlocks;

-	unlocks = srcu_readers_unlock_idx(ssp, idx);
+	unlocks = srcu_readers_unlock_idx(ssp, idx, &rdm);
+	bool did_gp = !!(rdm & SRCU_READ_FLAVOR_LITE);

 	/*
 	 * Make sure that a lock is always counted if the corresponding

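The WARN_ONCE() condition in the diff uses the `mask & (mask - 1)` idiom, which is nonzero exactly when more than one bit is set in the accumulated mask, that is, when readers of different flavors have used the same srcu_struct. A minimal userspace sketch of that test follows; the helper name `sketch_mixed_flavors()` is hypothetical and not part of the kernel.

```c
#include <stdbool.h>

/*
 * Returns true if more than one bit is set in mask.
 * Subtracting 1 clears the lowest set bit and sets all bits below it,
 * so the AND is nonzero only if some higher bit was also set.
 */
static bool sketch_mixed_flavors(unsigned long mask)
{
	return (mask & (mask - 1)) != 0;
}
```

A mask of zero (no readers seen yet) and a mask with a single flavor bit both pass the check; only a combination of flavor bits would trigger the warning.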
If srcu_read_lock_lite() is used on a given srcu_struct structure, then the grace-period processing must do synchronize_rcu() instead of smp_mb() between the scans of the ->srcu_unlock_count[] and ->srcu_lock_count[] counters. Currently, it does that by testing the SRCU_READ_FLAVOR_LITE bit of the ->srcu_reader_flavor mask, which works well. But only if the CPU running that srcu_struct structure's grace period has previously executed srcu_read_lock_lite(), which might not be the case, especially just after that srcu_struct structure has been created and initialized.

This commit therefore updates the srcu_readers_unlock_idx() function to OR together the ->srcu_reader_flavor masks from all CPUs, and then makes the srcu_readers_active_idx_check() function test the SRCU_READ_FLAVOR_LITE bit in the resulting mask.

Note that the srcu_readers_unlock_idx() function is already scanning all the CPUs to sum up the ->srcu_unlock_count[] fields and that this is on the grace-period slow path, hence no concerns about the small amount of extra work.

Reported-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Closes: https://lore.kernel.org/all/d07e8f4a-d5ff-4c8e-8e61-50db285c57e9@amd.com/
Fixes: c0f08d6b5a61 ("srcu: Add srcu_read_lock_lite() and srcu_read_unlock_lite()")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/rcu/srcutree.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)
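
The difference between consulting only the local CPU's mask and ORing the masks from all CPUs can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: `struct sketch_srcu`, the `SKETCH_*` constants, and both helper functions are hypothetical, and a plain array stands in for the per-CPU ->srcu_reader_flavor fields.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel definitions; values are illustrative. */
#define SKETCH_FLAVOR_LITE 0x4UL
#define SKETCH_NR_CPUS     4

/* One slot per simulated CPU, modeling the per-CPU ->srcu_reader_flavor. */
struct sketch_srcu {
	unsigned long reader_flavor[SKETCH_NR_CPUS];
};

/* Old approach: look only at the CPU that is running the grace period. */
static bool sketch_need_sync_rcu_local(const struct sketch_srcu *ssp, int this_cpu)
{
	return !!(ssp->reader_flavor[this_cpu] & SKETCH_FLAVOR_LITE);
}

/* New approach: OR the masks from every CPU, then test the LITE bit. */
static bool sketch_need_sync_rcu_all(const struct sketch_srcu *ssp)
{
	unsigned long mask = 0;
	int cpu;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		mask |= ssp->reader_flavor[cpu];
	return !!(mask & SKETCH_FLAVOR_LITE);
}
```

In this model, if only simulated CPU 2 has recorded a lite reader and the grace period runs on CPU 0, the local test reports false while the all-CPU test reports true, which corresponds to the case the commit message describes.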