diff mbox

[v2,23/23] multipathd: fix signal blocking logic

Message ID 20180305231507.10386-24-mwilck@suse.com (mailing list archive)
State Not Applicable, archived
Delegated to: christophe varoqui
Headers show

Commit Message

Martin Wilck March 5, 2018, 11:15 p.m. UTC
multipathd is supposed to block all signals in all threads, except
the uxlsnr thread which handles termination and reconfiguration
signals (SIGUSR1) in its ppoll() call, SIGUSR2 in the waiter thread
and the marginal path checker thread, and occasional SIGALRM. The current
logic does exactly the oppsite, it blocks termination signals in SIGPOLL and
allows multipathd to be killed e.g. by SIGALRM.

Fix that by inverting the logic. The argument to pthread_sigmask and
ppoll is the set of *blocked* signals, not vice versa.

The marginal paths code needs to unblock SIGUSR2 now explicity, as
the dm-event waiter code already does. Doing this with pselect()
avoids asynchronous cancellation.

Fixes: 810082e "libmultipath, multipathd: Rework SIGPIPE handling"
Fixes: 534ec4c "multipathd: Ensure that SIGINT, SIGTERM, SIGHUP and SIGUSR1
are delivered to the uxsock thread"

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 libmultipath/io_err_stat.c | 17 ++++++++++++++++-
 multipathd/main.c          |  7 +++++--
 multipathd/uxlsnr.c        | 10 +++++-----
 3 files changed, 26 insertions(+), 8 deletions(-)

Comments

Hannes Reinecke March 6, 2018, 7:16 a.m. UTC | #1
On 03/06/2018 12:15 AM, Martin Wilck wrote:
> multipathd is supposed to block all signals in all threads, except
> the uxlsnr thread which handles termination and reconfiguration
> signals (SIGUSR1) in its ppoll() call, SIGUSR2 in the waiter thread
> and the marginal path checker thread, and occasional SIGALRM. The current
> logic does exactly the oppsite, it blocks termination signals in SIGPOLL and
> allows multipathd to be killed e.g. by SIGALRM.
> 
> Fix that by inverting the logic. The argument to pthread_sigmask and
> ppoll is the set of *blocked* signals, not vice versa.
> 
> The marginal paths code needs to unblock SIGUSR2 now explicity, as
> the dm-event waiter code already does. Doing this with pselect()
> avoids asynchronous cancellation.
> 
> Fixes: 810082e "libmultipath, multipathd: Rework SIGPIPE handling"
> Fixes: 534ec4c "multipathd: Ensure that SIGINT, SIGTERM, SIGHUP and SIGUSR1
> are delivered to the uxsock thread"
> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  libmultipath/io_err_stat.c | 17 ++++++++++++++++-
>  multipathd/main.c          |  7 +++++--
>  multipathd/uxlsnr.c        | 10 +++++-----
>  3 files changed, 26 insertions(+), 8 deletions(-)
> 
Sigh.
Will we ever get signal handling correct?

Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
Martin Wilck March 6, 2018, 8:55 a.m. UTC | #2
On Tue, 2018-03-06 at 08:16 +0100, Hannes Reinecke wrote:
> On 03/06/2018 12:15 AM, Martin Wilck wrote:
> > multipathd is supposed to block all signals in all threads, except
> > the uxlsnr thread which handles termination and reconfiguration
> > signals (SIGUSR1) in its ppoll() call, SIGUSR2 in the waiter thread
> > and the marginal path checker thread, and occasional SIGALRM. The
> > current
> > logic does exactly the oppsite, it blocks termination signals in
> > SIGPOLL and
> > allows multipathd to be killed e.g. by SIGALRM.
> > 
> > Fix that by inverting the logic. The argument to pthread_sigmask
> > and
> > ppoll is the set of *blocked* signals, not vice versa.
> > 
> > The marginal paths code needs to unblock SIGUSR2 now explicity, as
> > the dm-event waiter code already does. Doing this with pselect()
> > avoids asynchronous cancellation.
> > 
> > Fixes: 810082e "libmultipath, multipathd: Rework SIGPIPE handling"
> > Fixes: 534ec4c "multipathd: Ensure that SIGINT, SIGTERM, SIGHUP and
> > SIGUSR1
> > are delivered to the uxsock thread"
> > 
> > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > ---
> >  libmultipath/io_err_stat.c | 17 ++++++++++++++++-
> >  multipathd/main.c          |  7 +++++--
> >  multipathd/uxlsnr.c        | 10 +++++-----
> >  3 files changed, 26 insertions(+), 8 deletions(-)
> > 
> 
> Sigh.
> Will we ever get signal handling correct?

I'm quite confident that we're close now.
But time will tell.

Martin
Benjamin Marzinski March 7, 2018, 8:24 p.m. UTC | #3
On Tue, Mar 06, 2018 at 12:15:07AM +0100, Martin Wilck wrote:
> multipathd is supposed to block all signals in all threads, except
> the uxlsnr thread which handles termination and reconfiguration
> signals (SIGUSR1) in its ppoll() call, SIGUSR2 in the waiter thread
> and the marginal path checker thread, and occasional SIGALRM. The current
> logic does exactly the oppsite, it blocks termination signals in SIGPOLL and
> allows multipathd to be killed e.g. by SIGALRM.
> 
> Fix that by inverting the logic. The argument to pthread_sigmask and
> ppoll is the set of *blocked* signals, not vice versa.
> 
> The marginal paths code needs to unblock SIGUSR2 now explicity, as
> the dm-event waiter code already does. Doing this with pselect()
> avoids asynchronous cancellation.
> 
> Fixes: 810082e "libmultipath, multipathd: Rework SIGPIPE handling"
> Fixes: 534ec4c "multipathd: Ensure that SIGINT, SIGTERM, SIGHUP and SIGUSR1
> are delivered to the uxsock thread"
> 

Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>

> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  libmultipath/io_err_stat.c | 17 ++++++++++++++++-
>  multipathd/main.c          |  7 +++++--
>  multipathd/uxlsnr.c        | 10 +++++-----
>  3 files changed, 26 insertions(+), 8 deletions(-)
> 
> diff --git a/libmultipath/io_err_stat.c b/libmultipath/io_err_stat.c
> index 5b10f0372f9f..00bac9e0e755 100644
> --- a/libmultipath/io_err_stat.c
> +++ b/libmultipath/io_err_stat.c
> @@ -21,6 +21,7 @@
>  #include <libaio.h>
>  #include <errno.h>
>  #include <sys/mman.h>
> +#include <sys/select.h>
>  
>  #include "vector.h"
>  #include "memory.h"
> @@ -691,14 +692,28 @@ static void service_paths(void)
>  
>  static void *io_err_stat_loop(void *data)
>  {
> +	sigset_t set;
> +
>  	vecs = (struct vectors *)data;
>  	pthread_cleanup_push(rcu_unregister, NULL);
>  	rcu_register_thread();
>  
> +	sigfillset(&set);
> +	sigdelset(&set, SIGUSR2);
> +
>  	mlockall(MCL_CURRENT | MCL_FUTURE);
>  	while (1) {
> +		struct timespec ts;
> +
>  		service_paths();
> -		usleep(100000);
> +
> +		ts.tv_sec = 0;
> +		ts.tv_nsec = 100 * 1000 * 1000;
> +		/*
> +		 * pselect() with no fds, a timeout, and a sigmask:
> +		 * sleep for 100ms and react on SIGUSR2.
> +		 */
> +		pselect(1, NULL, NULL, NULL, &ts, &set);
>  	}
>  
>  	pthread_cleanup_pop(1);
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 4abdd8f071c3..68595836d723 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -2261,10 +2261,13 @@ signal_init(void)
>  {
>  	sigset_t set;
>  
> -	sigemptyset(&set);
> -	sigaddset(&set, SIGUSR2);
> +	/* block all signals */
> +	sigfillset(&set);
> +	/* SIGPIPE occurs if logging fails */
> +	sigdelset(&set, SIGPIPE);
>  	pthread_sigmask(SIG_SETMASK, &set, NULL);
>  
> +	/* Other signals will be unblocked in the uxlsnr thread */
>  	signal_set(SIGHUP, sighup);
>  	signal_set(SIGUSR1, sigusr1);
>  	signal_set(SIGUSR2, sigusr2);
> diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
> index 98ac25a68c43..a2ca36ba1653 100644
> --- a/multipathd/uxlsnr.c
> +++ b/multipathd/uxlsnr.c
> @@ -170,11 +170,11 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, void * trigger_data)
>  		condlog(0, "uxsock: failed to allocate poll fds");
>  		return NULL;
>  	}
> -	sigemptyset(&mask);
> -	sigaddset(&mask, SIGINT);
> -	sigaddset(&mask, SIGTERM);
> -	sigaddset(&mask, SIGHUP);
> -	sigaddset(&mask, SIGUSR1);
> +	sigfillset(&mask);
> +	sigdelset(&mask, SIGINT);
> +	sigdelset(&mask, SIGTERM);
> +	sigdelset(&mask, SIGHUP);
> +	sigdelset(&mask, SIGUSR1);
>  	while (1) {
>  		struct client *c, *tmp;
>  		int i, poll_count, num_clients;
> -- 
> 2.16.1

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
diff mbox

Patch

diff --git a/libmultipath/io_err_stat.c b/libmultipath/io_err_stat.c
index 5b10f0372f9f..00bac9e0e755 100644
--- a/libmultipath/io_err_stat.c
+++ b/libmultipath/io_err_stat.c
@@ -21,6 +21,7 @@ 
 #include <libaio.h>
 #include <errno.h>
 #include <sys/mman.h>
+#include <sys/select.h>
 
 #include "vector.h"
 #include "memory.h"
@@ -691,14 +692,28 @@  static void service_paths(void)
 
 static void *io_err_stat_loop(void *data)
 {
+	sigset_t set;
+
 	vecs = (struct vectors *)data;
 	pthread_cleanup_push(rcu_unregister, NULL);
 	rcu_register_thread();
 
+	sigfillset(&set);
+	sigdelset(&set, SIGUSR2);
+
 	mlockall(MCL_CURRENT | MCL_FUTURE);
 	while (1) {
+		struct timespec ts;
+
 		service_paths();
-		usleep(100000);
+
+		ts.tv_sec = 0;
+		ts.tv_nsec = 100 * 1000 * 1000;
+		/*
+		 * pselect() with no fds, a timeout, and a sigmask:
+		 * sleep for 100ms and react on SIGUSR2.
+		 */
+		pselect(1, NULL, NULL, NULL, &ts, &set);
 	}
 
 	pthread_cleanup_pop(1);
diff --git a/multipathd/main.c b/multipathd/main.c
index 4abdd8f071c3..68595836d723 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2261,10 +2261,13 @@  signal_init(void)
 {
 	sigset_t set;
 
-	sigemptyset(&set);
-	sigaddset(&set, SIGUSR2);
+	/* block all signals */
+	sigfillset(&set);
+	/* SIGPIPE occurs if logging fails */
+	sigdelset(&set, SIGPIPE);
 	pthread_sigmask(SIG_SETMASK, &set, NULL);
 
+	/* Other signals will be unblocked in the uxlsnr thread */
 	signal_set(SIGHUP, sighup);
 	signal_set(SIGUSR1, sigusr1);
 	signal_set(SIGUSR2, sigusr2);
diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
index 98ac25a68c43..a2ca36ba1653 100644
--- a/multipathd/uxlsnr.c
+++ b/multipathd/uxlsnr.c
@@ -170,11 +170,11 @@  void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, void * trigger_data)
 		condlog(0, "uxsock: failed to allocate poll fds");
 		return NULL;
 	}
-	sigemptyset(&mask);
-	sigaddset(&mask, SIGINT);
-	sigaddset(&mask, SIGTERM);
-	sigaddset(&mask, SIGHUP);
-	sigaddset(&mask, SIGUSR1);
+	sigfillset(&mask);
+	sigdelset(&mask, SIGINT);
+	sigdelset(&mask, SIGTERM);
+	sigdelset(&mask, SIGHUP);
+	sigdelset(&mask, SIGUSR1);
 	while (1) {
 		struct client *c, *tmp;
 		int i, poll_count, num_clients;