diff mbox

[6/6] multipathd: Remove a busy-waiting loop

Message ID c7ce9f1d-861e-a9a8-443f-8f921ce742df@sandisk.com (mailing list archive)
State Not Applicable, archived
Delegated to: christophe varoqui
Headers show

Commit Message

Bart Van Assche Aug. 17, 2016, 7:57 p.m. UTC
On 08/17/2016 12:36 PM, Dragan Stancevic wrote:
> Acked-by: dragan.stancevic@canonical.com
> <mailto:dragan.stancevic@canonical.com>
>
> I agree with your patch, I have been tracking an issue where multipathd
> dumps core on the exit path just past the treads being canceled. Your
> patch is very similar to mine (minus nuking the depth) that I was going
> to send out to a user to test with. The checker thread accesses a valid
> pointer with garbage values....

Hello Dragan,

When I prepared my patch I overlooked that multipathd creates most 
threads as detached threads. So I started testing the three attached 
patches on my setup. It would be appreciated if you could have a look at 
these patches.

Thanks,

Bart.
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Comments

Dragan Stancevic Aug. 18, 2016, 8:54 p.m. UTC | #1
Hi Bart-

it looks ok to me, these three patches combined with your previous patch
that is already in the repo...


Acked-by: dragan.stancevic@canonical.com

On Wed, Aug 17, 2016 at 2:57 PM, Bart Van Assche <bart.vanassche@sandisk.com
> wrote:

> On 08/17/2016 12:36 PM, Dragan Stancevic wrote:
>
>> Acked-by: dragan.stancevic@canonical.com
>> <mailto:dragan.stancevic@canonical.com>
>>
>> I agree with your patch, I have been tracking an issue where multipathd
>> dumps core on the exit path just past the treads being canceled. Your
>> patch is very similar to mine (minus nuking the depth) that I was going
>> to send out to a user to test with. The checker thread accesses a valid
>> pointer with garbage values....
>>
>
> Hello Dragan,
>
> When I prepared my patch I overlooked that multipathd creates most threads
> as detached threads. So I started testing the three attached patches on my
> setup. It would be appreciated if you could have a look at these patches.
>
> Thanks,
>
> Bart.
>
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Bart Van Assche Aug. 18, 2016, 10:42 p.m. UTC | #2
On 08/18/2016 01:54 PM, Dragan Stancevic wrote:
> it looks ok to me, these three patches combined with your previous patch
> that is already in the repo...

Hello Dan,

Thanks for the feedback. I will send the first patch to Christophe but I 
would like to give the second and third patches a bit more thought.

Bart.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
diff mbox

Patch

>From e1144d5321f87bc72e6fb29b40cf7728125aa926 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@sandisk.com>
Date: Wed, 17 Aug 2016 09:58:09 -0700
Subject: [PATCH 3/3] libmultipath/checkers/tur: Fix race related to thread
 termination

Since the tur thread is a detached thread, all data structures
associated with that thread are freed once the thread stops. Avoid
that pthread_cancel() triggers a use-after-free by protecting
pthread_cancel() calls with the same spinlock that protects ct->thread.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
---
 libmultipath/checkers/tur.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/libmultipath/checkers/tur.c b/libmultipath/checkers/tur.c
index c014b65..f724350 100644
--- a/libmultipath/checkers/tur.c
+++ b/libmultipath/checkers/tur.c
@@ -3,6 +3,7 @@ 
  *
  * Copyright (c) 2004 Christophe Varoqui
  */
+#include <assert.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
@@ -98,10 +99,11 @@  void libcheck_free (struct checker * c)
 		ct->holders--;
 		holders = ct->holders;
 		thread = ct->thread;
-		pthread_spin_unlock(&ct->hldr_lock);
 		if (holders)
 			pthread_cancel(thread);
-		else
+		pthread_spin_unlock(&ct->hldr_lock);
+
+		if (!holders)
 			cleanup_context(ct);
 		c->context = NULL;
 	}
@@ -207,6 +209,7 @@  void cleanup_func(void *data)
 	int holders;
 	struct tur_checker_context *ct = data;
 	pthread_spin_lock(&ct->hldr_lock);
+	assert(ct->holders > 0);
 	ct->holders--;
 	holders = ct->holders;
 	ct->thread = 0;
@@ -310,7 +313,12 @@  libcheck_check (struct checker * c)
 			if (tur_check_async_timeout(c)) {
 				condlog(3, "%d:%d: tur checker timeout",
 					TUR_DEVT(ct));
-				pthread_cancel(ct->thread);
+
+				pthread_spin_lock(&ct->hldr_lock);
+				if (ct->holders > 0)
+					pthread_cancel(ct->thread);
+				pthread_spin_unlock(&ct->hldr_lock);
+
 				ct->running = 0;
 				MSG(c, MSG_TUR_TIMEOUT);
 				tur_status = PATH_TIMEOUT;
@@ -349,9 +357,9 @@  libcheck_check (struct checker * c)
 		if (r) {
 			pthread_spin_lock(&ct->hldr_lock);
 			ct->holders--;
+			ct->thread = 0;
 			pthread_spin_unlock(&ct->hldr_lock);
 			pthread_mutex_unlock(&ct->lock);
-			ct->thread = 0;
 			condlog(3, "%d:%d: failed to start tur thread, using"
 				" sync mode", TUR_DEVT(ct));
 			return tur_check(c->fd, c->timeout, c->message);
-- 
2.9.2