Message ID:    1521235867-29476-2-git-send-email-bmarzins@redhat.com (mailing list archive)
State:         Not Applicable, archived
Delegated to:  christophe varoqui
On Fri, 2018-03-16 at 16:31 -0500, Benjamin Marzinski wrote:
> multipathd was sending a signal to the per-device waiter thread after
> cancelling it. Since these threads are detached, there is a window
> where a new thread could have started with the old thread id between
> the cancel and the signalling, causing the signal to be delivered to
> the wrong thread. Simply reversing the order doesn't fix the issue,
> since the waiter threads exit immediately if they receive a signal,
> again opening a window for the cancel to be delivered to the wrong
> thread.
>
> To fix this, multipathd does reverse the order, so that it signals
> the thread first (it needs to signal the thread, since the
> dm_task_run ioctl isn't a cancellation point) and then cancels it.
> However, it does this while holding a new mutex.
>
> The waiter thread can only exit without being cancelled for two
> reasons:
> 1. When it fails in update_multipath, which removes the device while
>    holding the vecs lock.
> 2. If it receives a SIGUSR2 signal while waiting for a dm event.
>
> Case 1 can never race with another thread removing the device, since
> removing a device always happens while holding the vecs lock. This
> means that if the device exists to be removed, then the waiter thread
> can't exit this way during the removal.
>
> Case 2 is now solved by grabbing the new mutex after dm_task_run()
> fails. With the mutex held, the thread checks if it has been
> cancelled. If it wasn't cancelled, the thread continues.
>
> The reason that this uses a new mutex, instead of the vecs lock, is
> that using the vecs lock would keep the thread from ending until the
> vecs lock was released. Normally, this isn't a problem. But during
> reconfigure, the waiter threads for all devices are stopped, and new
> ones started, all while holding the vecs lock. For systems with a
> large number of multipath devices, this would cause multipathd to
> have double its already large number of waiter threads during
> reconfigure, all locked into memory.
>
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  multipathd/waiter.c | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/multipathd/waiter.c b/multipathd/waiter.c
> index cb9708b..e894294 100644
> --- a/multipathd/waiter.c
> +++ b/multipathd/waiter.c
> @@ -23,6 +23,7 @@
>  #include "waiter.h"
>
>  pthread_attr_t waiter_attr;
> +struct mutex_lock waiter_lock = { .mutex = PTHREAD_MUTEX_INITIALIZER };
>
>  static struct event_thread *alloc_waiter (void)
>  {
> @@ -59,8 +60,12 @@ void stop_waiter_thread (struct multipath *mpp, struct vectors *vecs)
>  			mpp->waiter);
>  	thread = mpp->waiter;
>  	mpp->waiter = (pthread_t)0;
> -	pthread_cancel(thread);
> +	pthread_cleanup_push(cleanup_lock, &waiter_lock);
> +	lock(&waiter_lock);
> +	pthread_testcancel();

What's the purpose of this pthread_testcancel() call?

>  	pthread_kill(thread, SIGUSR2);
> +	pthread_cancel(thread);
> +	lock_cleanup_pop(&waiter_lock);
>  }
>
>  /*
> @@ -114,8 +119,13 @@ static int waiteventloop (struct event_thread *waiter)
>  	dm_task_destroy(waiter->dmt);
>  	waiter->dmt = NULL;
>
> -	if (!r)	/* wait interrupted by signal */
> -		return -1;
> +	if (!r) { /* wait interrupted by signal. check for cancellation */
> +		pthread_cleanup_push(cleanup_lock, &waiter_lock);
> +		lock(&waiter_lock);
> +		pthread_testcancel();
> +		lock_cleanup_pop(&waiter_lock);

Nitpick: I'd prefer pthread_cleanup_pop(1) here.

> +		return 1; /* If we weren't cancelled, just reschedule */
> +	}
>
>  	waiter->event_nr++;

Regards,
Martin
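For readers following the nitpick: pthread_cleanup_pop(1) both pops the handler pushed by pthread_cleanup_push() and executes it, so the mutex is released on the normal path as well as on cancellation. A minimal sketch of the idiom under discussion, using plain pthreads calls rather than multipathd's actual lock()/cleanup_lock() helpers:

#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Cleanup handler: runs if the thread is cancelled while the handler
 * is pushed, or when popped with a nonzero execute argument. */
static void unlock_mutex(void *arg)
{
	pthread_mutex_unlock((pthread_mutex_t *)arg);
}

static void critical_section(void)
{
	pthread_cleanup_push(unlock_mutex, &lock);
	pthread_mutex_lock(&lock);

	/* A cancellation point: if cancellation was requested while this
	 * thread waited for the lock, it terminates here and the handler
	 * above releases the mutex. */
	pthread_testcancel();

	/* ... work under the lock ... */

	/* execute=1: pop the handler AND run it, unlocking the mutex on
	 * the normal (non-cancelled) path too. */
	pthread_cleanup_pop(1);
}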
On Mon, Mar 19, 2018 at 01:59:24PM +0100, Martin Wilck wrote:
> On Fri, 2018-03-16 at 16:31 -0500, Benjamin Marzinski wrote:
> > diff --git a/multipathd/waiter.c b/multipathd/waiter.c
> > index cb9708b..e894294 100644
> > --- a/multipathd/waiter.c
> > +++ b/multipathd/waiter.c
> > @@ -23,6 +23,7 @@
> >  #include "waiter.h"
> >
> >  pthread_attr_t waiter_attr;
> > +struct mutex_lock waiter_lock = { .mutex = PTHREAD_MUTEX_INITIALIZER };
> >
> >  static struct event_thread *alloc_waiter (void)
> >  {
> > @@ -59,8 +60,12 @@ void stop_waiter_thread (struct multipath *mpp, struct vectors *vecs)
> >  			mpp->waiter);
> >  	thread = mpp->waiter;
> >  	mpp->waiter = (pthread_t)0;
> > -	pthread_cancel(thread);
> > +	pthread_cleanup_push(cleanup_lock, &waiter_lock);
> > +	lock(&waiter_lock);
> > +	pthread_testcancel();
>
> What's the purpose of this pthread_testcancel() call?

It's not necessary. I feel like in general, after you get done waiting
on a lock, it's good form to see if you've been cancelled while you
were waiting. However, it's really unlikely that a thread will be
waiting here long, and this pthread_cancel isn't protecting it from
accessing any freed data. If you think it's likely to confuse people,
I have no problem with removing it.

> >  	pthread_kill(thread, SIGUSR2);
> > +	pthread_cancel(thread);
> > +	lock_cleanup_pop(&waiter_lock);
> >  }
> >
> >  /*
> > @@ -114,8 +119,13 @@ static int waiteventloop (struct event_thread *waiter)
> >  	dm_task_destroy(waiter->dmt);
> >  	waiter->dmt = NULL;
> >
> > -	if (!r)	/* wait interrupted by signal */
> > -		return -1;
> > +	if (!r) { /* wait interrupted by signal. check for cancellation */
> > +		pthread_cleanup_push(cleanup_lock, &waiter_lock);
> > +		lock(&waiter_lock);
> > +		pthread_testcancel();
> > +		lock_cleanup_pop(&waiter_lock);
>
> Nitpick: I'd prefer pthread_cleanup_pop(1) here.

I have no problem with changing that.

-Ben

> > +		return 1; /* If we weren't cancelled, just reschedule */
> > +	}
> >
> >  	waiter->event_nr++;
>
> Regards,
> Martin
>
> --
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
On Mon, 2018-03-19 at 11:25 -0500, Benjamin Marzinski wrote:
> On Mon, Mar 19, 2018 at 01:59:24PM +0100, Martin Wilck wrote:
> > On Fri, 2018-03-16 at 16:31 -0500, Benjamin Marzinski wrote:
> > > diff --git a/multipathd/waiter.c b/multipathd/waiter.c
> > > index cb9708b..e894294 100644
> > > --- a/multipathd/waiter.c
> > > +++ b/multipathd/waiter.c
> > > @@ -23,6 +23,7 @@
> > >  #include "waiter.h"
> > >
> > >  pthread_attr_t waiter_attr;
> > > +struct mutex_lock waiter_lock = { .mutex = PTHREAD_MUTEX_INITIALIZER };
> > >
> > >  static struct event_thread *alloc_waiter (void)
> > >  {
> > > @@ -59,8 +60,12 @@ void stop_waiter_thread (struct multipath *mpp, struct vectors *vecs)
> > >  			mpp->waiter);
> > >  	thread = mpp->waiter;
> > >  	mpp->waiter = (pthread_t)0;
> > > -	pthread_cancel(thread);
> > > +	pthread_cleanup_push(cleanup_lock, &waiter_lock);
> > > +	lock(&waiter_lock);
> > > +	pthread_testcancel();
> >
> > What's the purpose of this pthread_testcancel() call?
>
> It's not necessary. I feel like in general, after you get done waiting
> on a lock, it's good form to see if you've been cancelled while you
> were waiting. However, it's really unlikely that a thread will be
> waiting here long, and this pthread_cancel isn't protecting it from
> accessing any freed data. If you think it's likely to confuse people,
> I have no problem with removing it.

My thinking was: this thread is just about to cancel another thread. If
it's cancelled itself just before doing that, the other one won't be
cancelled. I find it hard to imagine conditions where that would be the
right thing. So if at all, I'd call pthread_testcancel() after releasing
the lock. But probably it's best just to remove it here.

Regards,
Martin
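Putting the review comments together, the stop side of the handshake reduces to the sketch below. This is illustrative only: the names are simplified (the real function is stop_waiter_thread() with multipath-specific bookkeeping, and multipathd wraps the mutex in its own struct mutex_lock with lock()/cleanup_lock() helpers), and the pthread_testcancel() is dropped as agreed above.

#include <pthread.h>
#include <signal.h>

static pthread_mutex_t waiter_lock = PTHREAD_MUTEX_INITIALIZER;

static void unlock_mutex(void *arg)
{
	pthread_mutex_unlock((pthread_mutex_t *)arg);
}

/* Stop a detached waiter thread. Per the patch's reasoning, holding
 * waiter_lock across both calls keeps the waiter from exiting (and its
 * thread id from being reused) between the signal and the cancel. */
static void stop_waiter(pthread_t thread)
{
	pthread_cleanup_push(unlock_mutex, &waiter_lock);
	pthread_mutex_lock(&waiter_lock);
	pthread_kill(thread, SIGUSR2);	/* break the non-cancellable ioctl wait */
	pthread_cancel(thread);		/* then request cancellation */
	pthread_cleanup_pop(1);		/* pop and run the handler: unlock */
}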
diff --git a/multipathd/waiter.c b/multipathd/waiter.c
index cb9708b..e894294 100644
--- a/multipathd/waiter.c
+++ b/multipathd/waiter.c
@@ -23,6 +23,7 @@
 #include "waiter.h"
 
 pthread_attr_t waiter_attr;
+struct mutex_lock waiter_lock = { .mutex = PTHREAD_MUTEX_INITIALIZER };
 
 static struct event_thread *alloc_waiter (void)
 {
@@ -59,8 +60,12 @@ void stop_waiter_thread (struct multipath *mpp, struct vectors *vecs)
 			mpp->waiter);
 	thread = mpp->waiter;
 	mpp->waiter = (pthread_t)0;
-	pthread_cancel(thread);
+	pthread_cleanup_push(cleanup_lock, &waiter_lock);
+	lock(&waiter_lock);
+	pthread_testcancel();
 	pthread_kill(thread, SIGUSR2);
+	pthread_cancel(thread);
+	lock_cleanup_pop(&waiter_lock);
 }
 
 /*
@@ -114,8 +119,13 @@ static int waiteventloop (struct event_thread *waiter)
 	dm_task_destroy(waiter->dmt);
 	waiter->dmt = NULL;
 
-	if (!r)	/* wait interrupted by signal */
-		return -1;
+	if (!r) { /* wait interrupted by signal. check for cancellation */
+		pthread_cleanup_push(cleanup_lock, &waiter_lock);
+		lock(&waiter_lock);
+		pthread_testcancel();
+		lock_cleanup_pop(&waiter_lock);
+		return 1; /* If we weren't cancelled, just reschedule */
+	}
 
 	waiter->event_nr++;
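The waiter's side of the same handshake, again as a hedged sketch: wait_for_event() is a hypothetical stand-in for the dm_task_run() call (which blocks in a device-mapper ioctl and returns 0 when interrupted by a signal), and the helper names are simplified as above.

#include <pthread.h>

static pthread_mutex_t waiter_lock = PTHREAD_MUTEX_INITIALIZER;

static void unlock_mutex(void *arg)
{
	pthread_mutex_unlock((pthread_mutex_t *)arg);
}

/* Hypothetical stand-in for dm_task_run(): returns 0 when the blocking
 * wait is interrupted by a signal, nonzero on a real dm event. */
static int wait_for_event(void)
{
	return 0;	/* stub: pretend the wait was interrupted */
}

/* One pass of the waiter loop. Returns 1 to reschedule the wait,
 * 0 after handling an event; never returns if cancelled. */
static int wait_event_loop(void)
{
	int r = wait_for_event();

	if (!r) {
		/* Interrupted. If the stopper holds waiter_lock, we block
		 * here until it has issued pthread_cancel(), and the
		 * pthread_testcancel() below then terminates this thread,
		 * with the cleanup handler releasing the mutex. If the
		 * signal was spurious, we fall through and reschedule. */
		pthread_cleanup_push(unlock_mutex, &waiter_lock);
		pthread_mutex_lock(&waiter_lock);
		pthread_testcancel();
		pthread_cleanup_pop(1);
		return 1;
	}

	/* ... process the device-mapper event ... */
	return 0;
}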
multipathd was sending a signal to the per-device waiter thread after
cancelling it. Since these threads are detached, there is a window where
a new thread could have started with the old thread id between the
cancel and the signalling, causing the signal to be delivered to the
wrong thread. Simply reversing the order doesn't fix the issue, since
the waiter threads exit immediately if they receive a signal, again
opening a window for the cancel to be delivered to the wrong thread.

To fix this, multipathd does reverse the order, so that it signals the
thread first (it needs to signal the thread, since the dm_task_run ioctl
isn't a cancellation point) and then cancels it. However, it does this
while holding a new mutex.

The waiter thread can only exit without being cancelled for two reasons:

1. When it fails in update_multipath, which removes the device while
   holding the vecs lock.
2. If it receives a SIGUSR2 signal while waiting for a dm event.

Case 1 can never race with another thread removing the device, since
removing a device always happens while holding the vecs lock. This means
that if the device exists to be removed, then the waiter thread can't
exit this way during the removal.

Case 2 is now solved by grabbing the new mutex after dm_task_run()
fails. With the mutex held, the thread checks if it has been cancelled.
If it wasn't cancelled, the thread continues.

The reason that this uses a new mutex, instead of the vecs lock, is that
using the vecs lock would keep the thread from ending until the vecs
lock was released. Normally, this isn't a problem. But during
reconfigure, the waiter threads for all devices are stopped, and new
ones started, all while holding the vecs lock. For systems with a large
number of multipath devices, this would cause multipathd to have double
its already large number of waiter threads during reconfigure, all
locked into memory.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/waiter.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)
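One piece of context the commit message relies on: because the ioctl inside dm_task_run() is not a pthread cancellation point, the only way to unstick the waiter is a signal whose handler is installed without SA_RESTART, so the interrupted ioctl fails with EINTR and dm_task_run() returns 0. The sketch below only illustrates this general POSIX mechanism; multipathd installs its real SIGUSR2 handling elsewhere in the daemon.

#include <signal.h>
#include <string.h>

/* The handler body can be empty: its only job is to make a blocking
 * system call in the signalled thread return with EINTR. */
static void sigusr2_handler(int sig)
{
	(void)sig;
}

static int setup_waiter_signal(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = sigusr2_handler;
	sigemptyset(&sa.sa_mask);
	/* No SA_RESTART: an ioctl interrupted by SIGUSR2 fails with EINTR
	 * instead of being transparently restarted by the kernel. */
	sa.sa_flags = 0;
	return sigaction(SIGUSR2, &sa, NULL);
}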