| Message ID | 20240507093530.3043-26-urezki@gmail.com (mailing list archive) |
|---|---|
| State | New |
| Series | RCU changes for v6.10 |
Hello,

I feel I don't really like this patch but I am travelling without my working
laptop, can't read the source code ;) Quite possibly I am wrong, I'll return
to this when I get back on May 10.

Oleg.

On 05/07, Uladzislau Rezki (Sony) wrote:
>
> From: "Paul E. McKenney" <paulmck@kernel.org>
>
> The rcu_sync structure's ->gp_count field is updated under the protection
> of ->rss_lock, but read locklessly, and KCSAN noted the data race.
> This commit therefore uses WRITE_ONCE() to do this update to clearly
> document its racy nature.
>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
>  kernel/rcu/sync.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
> index 86df878a2fee..6c2bd9001adc 100644
> --- a/kernel/rcu/sync.c
> +++ b/kernel/rcu/sync.c
> @@ -122,7 +122,7 @@ void rcu_sync_enter(struct rcu_sync *rsp)
>           * we are called at early boot time but this shouldn't happen.
>           */
>      }
> -    rsp->gp_count++;
> +    WRITE_ONCE(rsp->gp_count, rsp->gp_count + 1);
>      spin_unlock_irq(&rsp->rss_lock);
>
>      if (gp_state == GP_IDLE) {
> @@ -151,11 +151,15 @@ void rcu_sync_enter(struct rcu_sync *rsp)
>   */
>  void rcu_sync_exit(struct rcu_sync *rsp)
>  {
> +    int gpc;
> +
>      WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
>      WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
>
>      spin_lock_irq(&rsp->rss_lock);
> -    if (!--rsp->gp_count) {
> +    gpc = rsp->gp_count - 1;
> +    WRITE_ONCE(rsp->gp_count, gpc);
> +    if (!gpc) {
>          if (rsp->gp_state == GP_PASSED) {
>              WRITE_ONCE(rsp->gp_state, GP_EXIT);
>              rcu_sync_call(rsp);
> --
> 2.39.2
On Tue, May 07, 2024 at 10:54:41AM -0400, Oleg Nesterov wrote:
> Hello,
>
> I feel I don't really like this patch but I am travelling without my working
> laptop, can't read the source code ;) Quite possibly I am wrong, I'll return
> to this when I get back on May 10.

By the stricter data-race rules used in RCU code [1], this is a bug that
needs to be fixed. This code is updating ->gp_count, which is read
locklessly, which in turn results in a data race. The fix is to mark the
updates (as below) with WRITE_ONCE().

Or is there something in one or the other of these updates to ->gp_count
that excludes lockless readers? (I am not seeing it, but you know this
code way better than I do!)

							Thanx, Paul

[1] https://docs.google.com/document/d/1FwZaXSg3A55ivVoWffA9iMuhJ3_Gmj_E494dLYjjyLQ/edit?usp=sharing

> Oleg.
>
> On 05/07, Uladzislau Rezki (Sony) wrote:
> >
> > From: "Paul E. McKenney" <paulmck@kernel.org>
> >
> > The rcu_sync structure's ->gp_count field is updated under the protection
> > of ->rss_lock, but read locklessly, and KCSAN noted the data race.
[...]
On 05/07, Paul E. McKenney wrote:
>
> On Tue, May 07, 2024 at 10:54:41AM -0400, Oleg Nesterov wrote:
> > Hello,
> >
> > I feel I don't really like this patch but I am travelling without my working
> > laptop, can't read the source code ;) Quite possibly I am wrong, I'll return
> > to this when I get back on May 10.
>
> By the stricter data-race rules used in RCU code [1], this is a bug that
> needs to be fixed.

Now that I can read the code... Sorry, still can't understand.

> which is read locklessly,

Where??? OK, OK, we have

	// rcu_sync_exit()
	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0)

and

	// rcu_sync_dtor()
	WARN_ON_ONCE(READ_ONCE(rsp->gp_count));

other than that ->gp_count is always accessed under ->rss_lock.

And yes, at least WARN_ON_ONCE() in rcu_sync_exit() can obviously race with
rcu_sync_enter/exit which update gp_count. I think this is fine
correctness-wise.

But OK, we need to please KCSAN (or is there another problem I missed ???)

We can move these WARN_ON()'s into the ->rss_lock protected section.

Or perhaps we can use data_race(rsp->gp_count) ? To be honest I thought
that READ_ONCE() should be enough...

Or we can simply kill these WARN_ON_ONCE()'s.

I don't understand why should we add more WRITE_ONCE()'s into the critical
section protected by ->rss_lock.

Help! ;)

Oleg.

> which in turn results in a data race. The fix is to mark
> the updates (as below) with WRITE_ONCE().
>
> Or is there something in one or the other of these updates to ->gp_count
> that excludes lockless readers? (I am not seeing it, but you know this
> code way better than I do!)
[...]
On Thu, May 09, 2024 at 05:13:12PM +0200, Oleg Nesterov wrote:
> On 05/07, Paul E. McKenney wrote:
> >
> > By the stricter data-race rules used in RCU code [1], this is a bug that
> > needs to be fixed.
>
> Now that I can read the code... Sorry, still can't understand.
[...]
> We can move these WARN_ON()'s into the ->rss_lock protected section.
>
> Or perhaps we can use data_race(rsp->gp_count) ? To be honest I thought
> that READ_ONCE() should be enough...
>
> Or we can simply kill these WARN_ON_ONCE()'s.

Or we could move those WARN_ON_ONCE() under the lock. If this would be a
lock-contention issue, we could condition them with something like
IS_ENABLED(CONFIG_PROVE_RCU). Then all accesses to those variables would
always be protected by the lock, and the WRITE_ONCE() and READ_ONCE()
calls could be dropped. (Or am I missing another lockless access?)

Which would have the further advantage that if anyone accessed these
without holding the lock, KCSAN would complain.

> I don't understand why should we add more WRITE_ONCE()'s into the critical
> section protected by ->rss_lock.

There are indeed several ways to fix this. Which would you prefer?

> Help! ;)

;-) ;-) ;-)

							Thanx, Paul

> Oleg.
[...]
On 05/09, Paul E. McKenney wrote:
>
> On Thu, May 09, 2024 at 05:13:12PM +0200, Oleg Nesterov wrote:
> >
> > We can move these WARN_ON()'s into the ->rss_lock protected section.
> >
> > Or perhaps we can use data_race(rsp->gp_count) ? To be honest I thought
> > that READ_ONCE() should be enough...
> >
> > Or we can simply kill these WARN_ON_ONCE()'s.
>
> Or we could move those WARN_ON_ONCE() under the lock.

Sure, see above.

But could you help me to understand this magic? I naively thought
that READ_ONCE() is always "safe"...

So, unless I am totally confused it turns out that, say,

	CPU 0                           CPU 1
	-----                           -----
	spin_lock(LOCK);
	++X;                            READ_ONCE(X); // data race
	spin_unlock(LOCK);

is data-racy according to KCSAN, while

	CPU 0                           CPU 1
	-----                           -----
	spin_lock(LOCK);
	WRITE_ONCE(X, X+1);             READ_ONCE(X); // no data race
	spin_unlock(LOCK);

is not.

Why is that?

Trying to read Documentation/dev-tools/kcsan.rst... it says

	KCSAN is aware of *marked atomic operations* (``READ_ONCE``, ``WRITE_ONCE``,

	...

	if all accesses to a variable that is accessed concurrently are properly
	marked, KCSAN will never trigger a watchpoint

but how can KCSAN detect that all accesses to X are properly marked? I see
nothing KCSAN-related in the definition of WRITE_ONCE() or READ_ONCE().

And what does the "all accesses" above actually mean? The 2nd version does

	WRITE_ONCE(X, X+1);

but "X + 1" is the plain/unmarked access?

Thanks,

Oleg.
On 05/07, Paul E. McKenney wrote:
>
> By the stricter data-race rules used in RCU code [1],
...
> [1] https://docs.google.com/document/d/1FwZaXSg3A55ivVoWffA9iMuhJ3_Gmj_E494dLYjjyLQ/edit?usp=sharing

I am getting more and more confused...

Does this mean that KCSAN/etc treats the files in kernel/rcu/
differently than the "Rest of Kernel"? Or what?

And how is it enforced?

Oleg.
On 05/10, Oleg Nesterov wrote:
>
> On 05/07, Paul E. McKenney wrote:
> >
> > By the stricter data-race rules used in RCU code [1],
> ...
> > [1] https://docs.google.com/document/d/1FwZaXSg3A55ivVoWffA9iMuhJ3_Gmj_E494dLYjjyLQ/edit?usp=sharing
>
> I am getting more and more confused...
>
> Does this mean that KCSAN/etc treats the files in kernel/rcu/
> differently than the "Rest of Kernel"? Or what?
>
> And how is it enforced?

I can only find the strnstr(buf, "rcu") checks in skip_report(),
but they only cover the KCSAN_REPORT_VALUE_CHANGE_ONLY case...

Oleg.
On Fri, May 10, 2024 at 03:18:50PM +0200, Oleg Nesterov wrote:
> On 05/07, Paul E. McKenney wrote:
> >
> > By the stricter data-race rules used in RCU code [1],
> ...
> > [1] https://docs.google.com/document/d/1FwZaXSg3A55ivVoWffA9iMuhJ3_Gmj_E494dLYjjyLQ/edit?usp=sharing
>
> I am getting more and more confused...
>
> Does this mean that KCSAN/etc treats the files in kernel/rcu/
> differently than the "Rest of Kernel"? Or what?

Yes.

> And how is it enforced?

By me running rcutorture with KCSAN with the Kconfig options listed in
that document.

							Thanx, Paul
> On May 10, 2024, at 19:31, Oleg Nesterov <oleg@redhat.com> wrote:
>
> On 05/09, Paul E. McKenney wrote:
[...]
> And what does the "all accesses" above actually mean? The 2nd version does
>
> 	WRITE_ONCE(X, X+1);
>
> but "X + 1" is the plain/unmarked access?

X + 1 and READ_ONCE(X) are two reads.

> Thanks,
>
> Oleg.
On Fri, May 10, 2024 at 01:31:49PM +0200, Oleg Nesterov wrote:
> On 05/09, Paul E. McKenney wrote:
> >
> > Or we could move those WARN_ON_ONCE() under the lock.
>
> Sure, see above.
>
> But could you help me to understand this magic? I naively thought
> that READ_ONCE() is always "safe"...
>
> So, unless I am totally confused it turns out that, say,
>
> 	CPU 0                           CPU 1
> 	-----                           -----
> 	spin_lock(LOCK);
> 	++X;                            READ_ONCE(X); // data race
> 	spin_unlock(LOCK);
>
> is data-racy according to KCSAN, while
>
> 	CPU 0                           CPU 1
> 	-----                           -----
> 	spin_lock(LOCK);
> 	WRITE_ONCE(X, X+1);             READ_ONCE(X); // no data race
> 	spin_unlock(LOCK);
>
> is not.

Agreed, in RCU code.

> Why is that?

Because I run KCSAN on RCU using Kconfig options that cause KCSAN
to be more strict.

> Trying to read Documentation/dev-tools/kcsan.rst... it says
>
> 	KCSAN is aware of *marked atomic operations* (``READ_ONCE``, ``WRITE_ONCE``,
>
> 	...
>
> 	if all accesses to a variable that is accessed concurrently are properly
> 	marked, KCSAN will never trigger a watchpoint
>
> but how can KCSAN detect that all accesses to X are properly marked? I see
> nothing KCSAN-related in the definition of WRITE_ONCE() or READ_ONCE().

The trick is that KCSAN sees the volatile cast that both READ_ONCE()
and WRITE_ONCE() use.

> And what does the "all accesses" above actually mean? The 2nd version does
>
> 	WRITE_ONCE(X, X+1);
>
> but "X + 1" is the plain/unmarked access?

That would be the correct usage in RCU code if there were lockless
accesses to X, which would use READ_ONCE(), but a lock was held across
that WRITE_ONCE() such that there would be no concurrent updates of X.
In that case, the "X+1" cannot be involved in a data race, so KCSAN
won't complain.

But if all accesses to X were protected by an exclusive lock, then there
would be no data races involving X, and thus no marking of any accesses
to X. Which would allow KCSAN to detect buggy lockless accesses to X.

							Thanx, Paul
On Fri, May 10, 2024 at 03:50:57PM +0200, Oleg Nesterov wrote:
> On 05/10, Oleg Nesterov wrote:
> >
> > Does this mean that KCSAN/etc treats the files in kernel/rcu/
> > differently than the "Rest of Kernel"? Or what?
> >
> > And how is it enforced?
>
> I can only find the strnstr(buf, "rcu") checks in skip_report(),
> but they only cover the KCSAN_REPORT_VALUE_CHANGE_ONLY case...

Huh, new one on me! When I run KCSAN, I set CONFIG_KCSAN_STRICT=y,
which implies CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n, which should
prevent skip_report() from even being invoked.

Which suggests that in the rest of the kernel, including "rcu_" in your
function name gets you stricter KCSAN checking. ;-)

							Thanx, Paul
Sorry for another delay...

On 05/10, Paul E. McKenney wrote:
>
> On Fri, May 10, 2024 at 03:50:57PM +0200, Oleg Nesterov wrote:
> >
> > I can only find the strnstr(buf, "rcu") checks in skip_report(),
> > but they only cover the KCSAN_REPORT_VALUE_CHANGE_ONLY case...
>
> Huh, new one on me! When I run KCSAN, I set CONFIG_KCSAN_STRICT=y,
> which implies CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n, which should
> prevent skip_report() from even being invoked.
>
> Which suggests that in the rest of the kernel, including "rcu_"
> in your function name gets you stricter KCSAN checking. ;-)

Yes. And that is why I was very confused. I misinterpreted the "stricter
data-race rules used in RCU code" as if there must be more "rcu-only"
hacks in the kernel/kcsan/ code which I can't find ;)

Oleg.
On 05/10, Paul E. McKenney wrote:
>
> On Fri, May 10, 2024 at 01:31:49PM +0200, Oleg Nesterov wrote:
>
> > Why is that?
>
> Because I run KCSAN on RCU using Kconfig options that cause KCSAN
> to be more strict.

Yes, I see now.

> > but how can KCSAN detect that all accesses to X are properly marked? I see
> > nothing KCSAN-related in the definition of WRITE_ONCE() or READ_ONCE().
>
> The trick is that KCSAN sees the volatile cast that both READ_ONCE()
> and WRITE_ONCE() use.

Hmm. grep-grep-grep... I seem to understand, DEFINE_TSAN_VOLATILE_READ_WRITE.
So __tsan_volatile_readX() will use KCSAN_ACCESS_ATOMIC.

Thanks!

> > And what does the "all accesses" above actually mean? The 2nd version does
> >
> > 	WRITE_ONCE(X, X+1);
> >
> > but "X + 1" is the plain/unmarked access?
>
> ...
>
> In that case, the "X+1" cannot be involved in a data race, so KCSAN
> won't complain.

Yes, yes, I understand now.

Paul, thanks for your explanations! and sorry for wasting your time by
provoking the unnecessarily long discussion.

I am going to send the trivial patch which moves these WARN_ON()'s under
spin_lock(), this looks more clean to me. But I won't argue if you prefer
your original patch.

Oleg.
On Sun, May 12, 2024 at 12:53:06PM +0200, Oleg Nesterov wrote:
> On 05/10, Paul E. McKenney wrote:
> >
> > The trick is that KCSAN sees the volatile cast that both READ_ONCE()
> > and WRITE_ONCE() use.
>
> Hmm. grep-grep-grep... I seem to understand, DEFINE_TSAN_VOLATILE_READ_WRITE.
> So __tsan_volatile_readX() will use KCSAN_ACCESS_ATOMIC.
>
> Thanks!

You got it!!!

> Paul, thanks for your explanations! and sorry for wasting your time by
> provoking the unnecessarily long discussion.

Not a problem and absolutely no need to apologize! Of course, please do
pass this information on to anyone else needing it.

> I am going to send the trivial patch which moves these WARN_ON()'s under
> spin_lock(), this looks more clean to me. But I won't argue if you prefer
> your original patch.

Actually, I like your patch quite a bit better than I do my original.
In fact, I feel quite foolish that I did not think of this to begin
with. With your way, we have strict locking for that field and can
therefore just use plain C-language accesses for all accesses to it.
KCSAN will then warn us of any buggy lockless access to that field,
even if that buggy access uses READ_ONCE(). Much much better your way!!!

So thank you for persisting on this!

							Thanx, Paul
On Fri, 10 May 2024 at 16:11, Paul E. McKenney <paulmck@kernel.org> wrote:
[...]
> > > Does this mean that KCSAN/etc treats the files in kernel/rcu/
> > > differently than the "Rest of Kernel"? Or what?
> > >
> > > And how is it enforced?
> >
> > I can only find the strnstr(buf, "rcu") checks in skip_report(),
> > but they only cover the KCSAN_REPORT_VALUE_CHANGE_ONLY case...
>
> Huh, new one on me! When I run KCSAN, I set CONFIG_KCSAN_STRICT=y,
> which implies CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n, which should
> prevent skip_report() from even being invoked.

The strnstr hack goes back to the first version of KCSAN released in
v5.8 [1]. It was added in response to Paul wanting the "stricter"
treatment for RCU even while KCSAN was still in development, and back
then syzbot was the first test system using KCSAN. Shortly after, Paul
added a KCSAN config for rcutorture with a laundry list of options to
make KCSAN "strict" (before we eventually added CONFIG_KCSAN_STRICT,
which greatly simplified that).

While the strnstr(.., "rcu") rules are redundant when using the stricter
rules (at least when CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n is set),
we're keeping the "rcu" special case around because there are test
robots and fuzzers that use the default KCSAN config (unlike
rcutorture). And because we know that RCU likes the stricter rules, the
"value change only" filter is ignored in all KCSAN configs for
rcu-related functions.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/kcsan/report.c?id=v5.8

Back then syzbot occasionally reported RCU data races, but these days
rcutorture probably finds all of them before syzbot ever gets its hands
on new code.

Thanks,
-- Marco
On Mon, May 13, 2024 at 04:13:35PM +0200, Marco Elver wrote:
> While the strnstr(.., "rcu") rules are redundant when using the
> stricter rules (at least when CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n
> is set), we're keeping the "rcu" special case around because there are
> test robots and fuzzers that use the default KCSAN config (unlike
> rcutorture). And because we know that RCU likes the stricter rules,
> the "value change only" filter is ignored in all KCSAN configs for
> rcu-related functions.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/kcsan/report.c?id=v5.8

Thank you for the background information!

> Back then syzbot occasionally reported RCU data races, but these days
> rcutorture probably finds all of them before syzbot ever gets its
> hands on new code.

I do try my best. ;-)

							Thanx, Paul
diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index 86df878a2fee..6c2bd9001adc 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -122,7 +122,7 @@ void rcu_sync_enter(struct rcu_sync *rsp)
 	 * we are called at early boot time but this shouldn't happen.
 	 */
 	}
-	rsp->gp_count++;
+	WRITE_ONCE(rsp->gp_count, rsp->gp_count + 1);
 	spin_unlock_irq(&rsp->rss_lock);
 
 	if (gp_state == GP_IDLE) {
@@ -151,11 +151,15 @@ void rcu_sync_enter(struct rcu_sync *rsp)
  */
 void rcu_sync_exit(struct rcu_sync *rsp)
 {
+	int gpc;
+
 	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
 	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
 
 	spin_lock_irq(&rsp->rss_lock);
-	if (!--rsp->gp_count) {
+	gpc = rsp->gp_count - 1;
+	WRITE_ONCE(rsp->gp_count, gpc);
+	if (!gpc) {
 		if (rsp->gp_state == GP_PASSED) {
 			WRITE_ONCE(rsp->gp_state, GP_EXIT);
 			rcu_sync_call(rsp);