Message ID | 20190610191420.27007-13-kent.overstreet@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | [01/12] Compiler Attributes: add __flatten |
Hi Kent,

On 2019/6/11 3:14 AM, Kent Overstreet wrote:
> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

Acked-by: Coly Li <colyli@suse.de>

I have also received a report of a suspected closure race condition in bcache, and people are asking to have this patch in Linux v5.3.

So before this patch is merged upstream, I plan to rebase it onto drivers/md/bcache/closure.c for now. Of course, you remain the author.

Once lib/closure.c is merged upstream, I will convert all closure usage in bcache to use lib/closure.{c,h}.

Thanks in advance.

Coly Li

> ---
>  lib/closure.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/lib/closure.c b/lib/closure.c
> index 46cfe4c382..3e6366c262 100644
> --- a/lib/closure.c
> +++ b/lib/closure.c
> @@ -104,8 +104,14 @@ struct closure_syncer {
>
>  static void closure_sync_fn(struct closure *cl)
>  {
> -	cl->s->done = 1;
> -	wake_up_process(cl->s->task);
> +	struct closure_syncer *s = cl->s;
> +	struct task_struct *p;
> +
> +	rcu_read_lock();
> +	p = READ_ONCE(s->task);
> +	s->done = 1;
> +	wake_up_process(p);
> +	rcu_read_unlock();
>  }
>
>  void __sched __closure_sync(struct closure *cl)
>
On 2019/7/16 6:47 PM, Coly Li wrote:
> Hi Kent,
>
> On 2019/6/11 3:14 AM, Kent Overstreet wrote:
>> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
> Acked-by: Coly Li <colyli@suse.de>
>
> I have also received a report of a suspected closure race condition in
> bcache, and people are asking to have this patch in Linux v5.3.
>
> So before this patch is merged upstream, I plan to rebase it onto
> drivers/md/bcache/closure.c for now. Of course, you remain the author.
>
> Once lib/closure.c is merged upstream, I will convert all closure usage
> in bcache to use lib/closure.{c,h}.

Hi Kent,

The reporter of the race bug tells me that the closure race is very hard to reproduce; after applying the patch and testing, they are not sure whether their closure race problem is fixed.

I also notice that rcu_read_lock()/rcu_read_unlock() is used here, but it is not clear to me what the RCU read lock does in closure_sync_fn(). I believe you have a reason for using RCU here; could you please give me some hints to help me understand the change better?

Thanks in advance.

Coly Li

>> ---
>>  lib/closure.c | 10 ++++++++--
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/closure.c b/lib/closure.c
>> index 46cfe4c382..3e6366c262 100644
>> --- a/lib/closure.c
>> +++ b/lib/closure.c
>> @@ -104,8 +104,14 @@ struct closure_syncer {
>>
>>  static void closure_sync_fn(struct closure *cl)
>>  {
>> -	cl->s->done = 1;
>> -	wake_up_process(cl->s->task);
>> +	struct closure_syncer *s = cl->s;
>> +	struct task_struct *p;
>> +
>> +	rcu_read_lock();
>> +	p = READ_ONCE(s->task);
>> +	s->done = 1;
>> +	wake_up_process(p);
>> +	rcu_read_unlock();
>>  }
>>
>>  void __sched __closure_sync(struct closure *cl)
On Thu, Jul 18, 2019 at 03:46:46PM +0800, Coly Li wrote:
> On 2019/7/16 6:47 PM, Coly Li wrote:
> > Hi Kent,
> >
> > On 2019/6/11 3:14 AM, Kent Overstreet wrote:
> >> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
> > Acked-by: Coly Li <colyli@suse.de>
> >
> > I have also received a report of a suspected closure race condition in
> > bcache, and people are asking to have this patch in Linux v5.3.
> >
> > So before this patch is merged upstream, I plan to rebase it onto
> > drivers/md/bcache/closure.c for now. Of course, you remain the author.
> >
> > Once lib/closure.c is merged upstream, I will convert all closure usage
> > in bcache to use lib/closure.{c,h}.
>
> Hi Kent,
>
> The reporter of the race bug tells me that the closure race is very hard
> to reproduce; after applying the patch and testing, they are not sure
> whether their closure race problem is fixed.
>
> I also notice that rcu_read_lock()/rcu_read_unlock() is used here, but it
> is not clear to me what the RCU read lock does in closure_sync_fn(). I
> believe you have a reason for using RCU here; could you please give me
> some hints to help me understand the change better?

The race occurs when a thread using closure_sync() notices cl->s->done == 1 before the thread calling closure_put() calls wake_up_process(). It is then possible for that thread to return and exit just before wake_up_process() is called, so we're trying to wake up a process that no longer exists.

rcu_read_lock() is sufficient to protect against this, as there's an RCU barrier somewhere in the process teardown path.
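The ordering described in this reply can be sketched in userspace C. This is only an analogy, not kernel code: `fake_task`, `syncer`, `published`, `waker_fn()` and `run_demo()` are hypothetical names, pthreads and C11 atomics stand in for kernel tasks, and RCU itself is not modeled. Freeing the task object only after both threads are joined plays the role that the RCU read-side section plays in the patch, assuming (as the reply states) that task teardown waits for RCU readers. The point it tries to show is that the waker reads `s->task` before setting `done`, and never touches `s` afterwards, because `s` lives on the waiter's stack.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for task_struct: heap object that outlives the waiter. */
struct fake_task {
	atomic_int wakeups;		/* counts simulated wake_up_process() calls */
};

/* Hypothetical stand-in for struct closure_syncer: lives on the waiter's stack. */
struct syncer {
	struct fake_task *task;
	atomic_int done;
};

/* How the waiter hands its stack-allocated syncer to the waker. */
static _Atomic(struct syncer *) published;

/* Mirrors the patched closure_sync_fn(): read task first, then set done. */
static void waker_fn(struct syncer *s)
{
	struct fake_task *p = s->task;	/* must happen before done = 1 */

	atomic_store(&s->done, 1);	/* from here on, *s may be gone */
	atomic_fetch_add(&p->wakeups, 1);	/* uses only the local copy p */
}

static void *waiter_thread(void *arg)
{
	struct syncer s;

	s.task = arg;
	atomic_init(&s.done, 0);
	atomic_store(&published, &s);

	/* Simplification: spin instead of sleeping until done is set. */
	while (!atomic_load(&s.done))
		sched_yield();
	return NULL;			/* s goes out of scope here */
}

static void *waker_thread(void *arg)
{
	struct syncer *s;

	(void)arg;
	while ((s = atomic_load(&published)) == NULL)
		sched_yield();
	waker_fn(s);
	return NULL;
}

/* Runs one waiter/waker pair and returns the number of wakeups delivered. */
int run_demo(void)
{
	struct fake_task *t = malloc(sizeof(*t));
	pthread_t waiter, waker;
	int rc, n;

	if (!t)
		return -1;
	atomic_init(&t->wakeups, 0);
	atomic_store(&published, NULL);

	rc = pthread_create(&waiter, NULL, waiter_thread, t);
	assert(rc == 0);
	rc = pthread_create(&waker, NULL, waker_thread, NULL);
	assert(rc == 0);
	(void)rc;

	pthread_join(waiter, NULL);
	pthread_join(waker, NULL);

	n = atomic_load(&t->wakeups);
	free(t);			/* freed only after the waker has finished */
	return n;
}
```

Calling `run_demo()` from a `main()` and building with `-pthread` runs one waiter/waker pair. In the pre-patch ordering the waker would dereference `s->task` after setting `done`, which in this sketch would mean reading a stack frame that the waiter may already have left.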
diff --git a/lib/closure.c b/lib/closure.c
index 46cfe4c382..3e6366c262 100644
--- a/lib/closure.c
+++ b/lib/closure.c
@@ -104,8 +104,14 @@ struct closure_syncer {
 
 static void closure_sync_fn(struct closure *cl)
 {
-	cl->s->done = 1;
-	wake_up_process(cl->s->task);
+	struct closure_syncer *s = cl->s;
+	struct task_struct *p;
+
+	rcu_read_lock();
+	p = READ_ONCE(s->task);
+	s->done = 1;
+	wake_up_process(p);
+	rcu_read_unlock();
 }
 
 void __sched __closure_sync(struct closure *cl)
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
---
 lib/closure.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)