diff mbox series

[2/3] nbd: unify behavior in timeout no matter how many sockets is configured

Message ID 20200921105718.29006-3-houpu@bytedance.com (mailing list archive)
State New, archived
Headers show
Series nbd: improve timeout handling and a fix | expand

Commit Message

Hou Pu Sept. 21, 2020, 10:57 a.m. UTC
When an nbd device is configured with multiple sockets, the request
is requeued to an active socket in xmit_timeout, the original socket
is closed if not.

Some time, the backend nbd server hang, thus all sockets will be
dropped and the nbd device is not usable. It would be better to have an
option to wait for more time (just reset timer in nbd_xmit_timeout).
Like what we do if we only have one socket. This patch allows it.

Signed-off-by: Hou Pu <houpu@bytedance.com>
---
 drivers/block/nbd.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

Comments

Josef Bacik Sept. 21, 2020, 6:01 p.m. UTC | #1
On 9/21/20 6:57 AM, Hou Pu wrote:
> When an nbd device is configured with multiple sockets, the request
> is requeued to an active socket in xmit_timeout, the original socket
> is closed if not.
> 
> Some time, the backend nbd server hang, thus all sockets will be
> dropped and the nbd device is not usable. It would be better to have an
> option to wait for more time (just reset timer in nbd_xmit_timeout).
> Like what we do if we only have one socket. This patch allows it.
> 
> Signed-off-by: Hou Pu <houpu@bytedance.com>
> ---
>   drivers/block/nbd.c | 7 ++-----
>   1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 538e9dcf5bf2..4c0bbb981cbc 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -400,8 +400,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
>   	    nbd->tag_set.timeout)
>   		goto error_out;
>   
> -	if (config->num_connections > 1 ||
> -	    (config->num_connections == 1 && nbd->tag_set.timeout)) {
> +	if (nbd->tag_set.timeout) {

So now if you don't set a .timeout and have num_connections > 1 we don't close 
the socket, so this won't work.  Thanks,

Josef
Hou Pu Sept. 22, 2020, 3:14 a.m. UTC | #2
On 2020/9/22 2:01 AM, Josef Bacik wrote:
> On 9/21/20 6:57 AM, Hou Pu wrote:
>> When an nbd device is configured with multiple sockets, the request
>> is requeued to an active socket in xmit_timeout, the original socket
>> is closed if not.
>>
>> Some time, the backend nbd server hang, thus all sockets will be
>> dropped and the nbd device is not usable. It would be better to have an
>> option to wait for more time (just reset timer in nbd_xmit_timeout).
>> Like what we do if we only have one socket. This patch allows it.
>>
>> Signed-off-by: Hou Pu <houpu@bytedance.com>
>> ---
>>   drivers/block/nbd.c | 7 ++-----
>>   1 file changed, 2 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>> index 538e9dcf5bf2..4c0bbb981cbc 100644
>> --- a/drivers/block/nbd.c
>> +++ b/drivers/block/nbd.c
>> @@ -400,8 +400,7 @@ static enum blk_eh_timer_return 
>> nbd_xmit_timeout(struct request *req,
>>           nbd->tag_set.timeout)
>>           goto error_out;
>> -    if (config->num_connections > 1 ||
>> -        (config->num_connections == 1 && nbd->tag_set.timeout)) {
>> +    if (nbd->tag_set.timeout) {
> 
> So now if you don't set a .timeout and have num_connections > 1 we don't 
> close the socket, so this won't work.  Thanks,
> 
> Josef

OK, will keep the behavior same as before when we don't set a .timeout
and num_connections > 1. And there is not need to keep this patch.
Will remove it in v2.

Thanks,
Hou
diff mbox series

Patch

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 538e9dcf5bf2..4c0bbb981cbc 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -400,8 +400,7 @@  static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 	    nbd->tag_set.timeout)
 		goto error_out;
 
-	if (config->num_connections > 1 ||
-	    (config->num_connections == 1 && nbd->tag_set.timeout)) {
+	if (nbd->tag_set.timeout) {
 		dev_err_ratelimited(nbd_to_dev(nbd),
 				    "Connection timed out, retrying (%d/%d alive)\n",
 				    atomic_read(&config->live_connections),
@@ -432,9 +431,7 @@  static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 			nbd_config_put(nbd);
 			return BLK_EH_DONE;
 		}
-	}
-
-	if (!nbd->tag_set.timeout) {
+	} else {
 		/*
 		 * Userspace sets timeout=0 to disable socket disconnection,
 		 * so just warn and reset the timer.