diff mbox series

[15/35] multipathd: uxlsnr: avoid stalled clients during reconfigure

Message ID 20210910114120.13665-16-mwilck@suse.com (mailing list archive)
State Not Applicable, archived
Delegated to: christophe varoqui
Headers show
Series multipathd: uxlsnr overhaul | expand

Commit Message

Martin Wilck Sept. 10, 2021, 11:41 a.m. UTC
From: Martin Wilck <mwilck@suse.com>

Since 47cc1d3 ("multipathd: fix client response for socket
activation"), we hold back clients while reconfigure is running.
The idea of 47cc1d3 was to fix the behavior during initial
start up. When multipathd reconfigures itself during runtime,
and the reconfiguration takes a long time (a minute or more is
not unusual in big configurations), clients will time out with
no response ("timeout receiving packet"). Waiting for reconfigure
to finish breaks our timeout handling.

Therefore we should only apply the logic of 47cc1d3 during initial
configuration. In this case, the client that triggered socket
activation may still encounter a timeout, but there's not much we can
do about that.

Fixes: 47cc1d3 ("multipathd: fix client response for socket activation")
Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c   |  9 +++++++++
 multipathd/uxlsnr.c | 12 ------------
 2 files changed, 9 insertions(+), 12 deletions(-)

Comments

Benjamin Marzinski Sept. 16, 2021, 2:17 a.m. UTC | #1
On Fri, Sep 10, 2021 at 01:41:00PM +0200, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> Since 47cc1d3 ("multipathd: fix client response for socket
> activation"), we hold back clients while reconfigure is running.
> The idea of 47cc1d3 was to fix the behavior during initial
> start up. When multipathd reconfigures itself during runtime,
> and the reconfiguration takes a long time (a minute or more is
> not unusual in big configurations), clients will time out with
> no response ("timeout receiving packet"). Waiting for reconfigure
> to finish breaks our timeout handling.
> 
> Therefore we should only apply the logic of 47cc1d3 during initial
> configuration. In this case, the client that triggered socket
> activation may still encounter a timeout, but there's not much we can
> do about that.
> 
> Fixes: 47cc1d3 ("multipathd: fix client response for socket activation")
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  multipathd/main.c   |  9 +++++++++
>  multipathd/uxlsnr.c | 12 ------------
>  2 files changed, 9 insertions(+), 12 deletions(-)
> 
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 6d7c8c9..c6357ef 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -1695,6 +1695,15 @@ uxlsnrloop (void * ap)
>  
>  	init_handler_callbacks();
>  	umask(077);
> +
> +	/*
> +	 * Wait for initial reconfiguration to finish, while
> +	 * hadling signals
> +	 */
> +	while (wait_for_state_change_if(DAEMON_CONFIGURE, 50)
> +	       == DAEMON_CONFIGURE)
> +		handle_signals(false);
> +
>  	uxsock_listen(&uxsock_trigger, ux_sock, ap);
>  
>  out_sock:
> diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
> index dbee0d6..20efbd3 100644
> --- a/multipathd/uxlsnr.c
> +++ b/multipathd/uxlsnr.c
> @@ -391,18 +391,6 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
>  			continue;
>  		}
>  
> -		/*
> -		 * Client connection. We shouldn't answer while we're
> -		 * configuring - nothing may be configured yet.
> -		 * But we can't wait forever either, because this thread
> -		 * must handle signals. So wait a short while only.
> -		 */
> -		if (wait_for_state_change_if(DAEMON_CONFIGURE, 10)
> -		    == DAEMON_CONFIGURE) {
> -			handle_signals(false);
> -			continue;
> -		}
> -
>  		/* see if a client wants to speak to us */
>  		for (i = POLLFDS_BASE; i < n_pfds; i++) {
>  			if (polls[i].revents & POLLIN) {
> -- 
> 2.33.0

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
diff mbox series

Patch

diff --git a/multipathd/main.c b/multipathd/main.c
index 6d7c8c9..c6357ef 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1695,6 +1695,15 @@  uxlsnrloop (void * ap)
 
 	init_handler_callbacks();
 	umask(077);
+
+	/*
+	 * Wait for initial reconfiguration to finish, while
+	 * hadling signals
+	 */
+	while (wait_for_state_change_if(DAEMON_CONFIGURE, 50)
+	       == DAEMON_CONFIGURE)
+		handle_signals(false);
+
 	uxsock_listen(&uxsock_trigger, ux_sock, ap);
 
 out_sock:
diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
index dbee0d6..20efbd3 100644
--- a/multipathd/uxlsnr.c
+++ b/multipathd/uxlsnr.c
@@ -391,18 +391,6 @@  void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
 			continue;
 		}
 
-		/*
-		 * Client connection. We shouldn't answer while we're
-		 * configuring - nothing may be configured yet.
-		 * But we can't wait forever either, because this thread
-		 * must handle signals. So wait a short while only.
-		 */
-		if (wait_for_state_change_if(DAEMON_CONFIGURE, 10)
-		    == DAEMON_CONFIGURE) {
-			handle_signals(false);
-			continue;
-		}
-
 		/* see if a client wants to speak to us */
 		for (i = POLLFDS_BASE; i < n_pfds; i++) {
 			if (polls[i].revents & POLLIN) {