Message ID | 20200810120044.2152-1-houpu@bytedance.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | nbd: restore default timeout when setting it to zero | expand |
On 8/10/20 8:00 AM, Hou Pu wrote: > If we configured io timeout of nbd0 to 100s. Later after we > finished using it, we configured nbd0 again and set the io > timeout to 0. We expect it would timeout after 30 seconds > and keep retry. But in fact we could not change the timeout > when we set it to 0. the timeout is still the original 100s. > > So change the timeout to default 30s when we set it to zero. > It also behaves same as commit 2da22da57348 ("nbd: fix zero > cmd timeout handling v2"). > > It becomes more important if we were reconfigure a nbd device > and the io timeout it set to zero. Because it could take 30s > to detect the new socket and thus io could be completed more > quickly compared to 100s. > > Signed-off-by: Hou Pu <houpu@bytedance.com> > --- > drivers/block/nbd.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c > index ce7e9f223b20..bc9dc1f847e1 100644 > --- a/drivers/block/nbd.c > +++ b/drivers/block/nbd.c > @@ -1360,6 +1360,8 @@ static void nbd_set_cmd_timeout(struct nbd_device *nbd, u64 timeout) > nbd->tag_set.timeout = timeout * HZ; > if (timeout) > blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ); > + else > + blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ); > } > > /* Must be called with config_lock held */ > What about the tag_set.timeout? Thanks, Josef
On 2020/8/21 3:03 AM, Josef Bacik wrote: > On 8/10/20 8:00 AM, Hou Pu wrote: >> If we configured io timeout of nbd0 to 100s. Later after we >> finished using it, we configured nbd0 again and set the io >> timeout to 0. We expect it would timeout after 30 seconds >> and keep retry. But in fact we could not change the timeout >> when we set it to 0. the timeout is still the original 100s. >> >> So change the timeout to default 30s when we set it to zero. >> It also behaves same as commit 2da22da57348 ("nbd: fix zero >> cmd timeout handling v2"). >> >> It becomes more important if we were reconfigure a nbd device >> and the io timeout it set to zero. Because it could take 30s >> to detect the new socket and thus io could be completed more >> quickly compared to 100s. >> >> Signed-off-by: Hou Pu <houpu@bytedance.com> >> --- >> drivers/block/nbd.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c >> index ce7e9f223b20..bc9dc1f847e1 100644 >> --- a/drivers/block/nbd.c >> +++ b/drivers/block/nbd.c >> @@ -1360,6 +1360,8 @@ static void nbd_set_cmd_timeout(struct >> nbd_device *nbd, u64 timeout) >> nbd->tag_set.timeout = timeout * HZ; >> if (timeout) >> blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ); >> + else >> + blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ); >> } >> /* Must be called with config_lock held */ >> > > What about the tag_set.timeout? Thanks, I think user space could set io timeout to 0, thus we set tag_set.timeout = 0 here and also we should tell the block layer to restore 30s timeout in case it is not. tag_set.timeout == 0 imply 30s io timeout and retrying after timeout. (Sorry, I am not sure if I understand your question here. Could you explain a little more if needed?) Thanks, Hou > > Josef
On 8/21/20 3:21 AM, Hou Pu wrote: > > > On 2020/8/21 3:03 AM, Josef Bacik wrote: >> On 8/10/20 8:00 AM, Hou Pu wrote: >>> If we configured io timeout of nbd0 to 100s. Later after we >>> finished using it, we configured nbd0 again and set the io >>> timeout to 0. We expect it would timeout after 30 seconds >>> and keep retry. But in fact we could not change the timeout >>> when we set it to 0. the timeout is still the original 100s. >>> >>> So change the timeout to default 30s when we set it to zero. >>> It also behaves same as commit 2da22da57348 ("nbd: fix zero >>> cmd timeout handling v2"). >>> >>> It becomes more important if we were reconfigure a nbd device >>> and the io timeout it set to zero. Because it could take 30s >>> to detect the new socket and thus io could be completed more >>> quickly compared to 100s. >>> >>> Signed-off-by: Hou Pu <houpu@bytedance.com> >>> --- >>> drivers/block/nbd.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c >>> index ce7e9f223b20..bc9dc1f847e1 100644 >>> --- a/drivers/block/nbd.c >>> +++ b/drivers/block/nbd.c >>> @@ -1360,6 +1360,8 @@ static void nbd_set_cmd_timeout(struct >>> nbd_device *nbd, u64 timeout) >>> nbd->tag_set.timeout = timeout * HZ; >>> if (timeout) >>> blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ); >>> + else >>> + blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ); >>> } >>> /* Must be called with config_lock held */ >>> >> >> What about the tag_set.timeout? Thanks, > > I think user space could set io timeout to 0, thus we set > tag_set.timeout = 0 here and also we should tell the block layer > to restore 30s timeout in case it is not. tag_set.timeout == 0 > imply 30s io timeout and retrying after timeout. > > (Sorry, I am not sure if I understand your question here. Could > you explain a little more if needed?) > I misunderstood what I was using the tagset timeout for. We don't want this here, if we're dropping a config for an nbd device and we want to reset it to defaults then we need to add this to nbd_config_put(). Thanks, Josef
On 2020/8/21 9:57 PM, Josef Bacik wrote: > On 8/21/20 3:21 AM, Hou Pu wrote: >> >> >> On 2020/8/21 3:03 AM, Josef Bacik wrote: >>> On 8/10/20 8:00 AM, Hou Pu wrote: >>>> If we configured io timeout of nbd0 to 100s. Later after we >>>> finished using it, we configured nbd0 again and set the io >>>> timeout to 0. We expect it would timeout after 30 seconds >>>> and keep retry. But in fact we could not change the timeout >>>> when we set it to 0. the timeout is still the original 100s. >>>> >>>> So change the timeout to default 30s when we set it to zero. >>>> It also behaves same as commit 2da22da57348 ("nbd: fix zero >>>> cmd timeout handling v2"). >>>> >>>> It becomes more important if we were reconfigure a nbd device >>>> and the io timeout it set to zero. Because it could take 30s >>>> to detect the new socket and thus io could be completed more >>>> quickly compared to 100s. >>>> >>>> Signed-off-by: Hou Pu <houpu@bytedance.com> >>>> --- >>>> drivers/block/nbd.c | 2 ++ >>>> 1 file changed, 2 insertions(+) >>>> >>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c >>>> index ce7e9f223b20..bc9dc1f847e1 100644 >>>> --- a/drivers/block/nbd.c >>>> +++ b/drivers/block/nbd.c >>>> @@ -1360,6 +1360,8 @@ static void nbd_set_cmd_timeout(struct >>>> nbd_device *nbd, u64 timeout) >>>> nbd->tag_set.timeout = timeout * HZ; >>>> if (timeout) >>>> blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ); >>>> + else >>>> + blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ); >>>> } >>>> /* Must be called with config_lock held */ >>>> >>> >>> What about the tag_set.timeout? Thanks, >> >> I think user space could set io timeout to 0, thus we set >> tag_set.timeout = 0 here and also we should tell the block layer >> to restore 30s timeout in case it is not. tag_set.timeout == 0 >> imply 30s io timeout and retrying after timeout. >> >> (Sorry, I am not sure if I understand your question here. Could >> you explain a little more if needed?) >> > > I misunderstood what I was using the tagset timeout for. We don't want > this here, if we're dropping a config for an nbd device and we want to > reset it to defaults then we need to add this to nbd_config_put(). Thanks, AFAIK If we killed a nbd server, then restarted it and reconfigured the nbd socket, I think we might not reconfigure IO timeout to 0 since nbd_config_put() is not called in such case. So could we still restore default timeout here. Or am I missing something? Thanks, Hou > > Josef
On 8/23/20 11:23 PM, Hou Pu wrote: > > > On 2020/8/21 9:57 PM, Josef Bacik wrote: >> On 8/21/20 3:21 AM, Hou Pu wrote: >>> >>> >>> On 2020/8/21 3:03 AM, Josef Bacik wrote: >>>> On 8/10/20 8:00 AM, Hou Pu wrote: >>>>> If we configured io timeout of nbd0 to 100s. Later after we >>>>> finished using it, we configured nbd0 again and set the io >>>>> timeout to 0. We expect it would timeout after 30 seconds >>>>> and keep retry. But in fact we could not change the timeout >>>>> when we set it to 0. the timeout is still the original 100s. >>>>> >>>>> So change the timeout to default 30s when we set it to zero. >>>>> It also behaves same as commit 2da22da57348 ("nbd: fix zero >>>>> cmd timeout handling v2"). >>>>> >>>>> It becomes more important if we were reconfigure a nbd device >>>>> and the io timeout it set to zero. Because it could take 30s >>>>> to detect the new socket and thus io could be completed more >>>>> quickly compared to 100s. >>>>> >>>>> Signed-off-by: Hou Pu <houpu@bytedance.com> >>>>> --- >>>>> drivers/block/nbd.c | 2 ++ >>>>> 1 file changed, 2 insertions(+) >>>>> >>>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c >>>>> index ce7e9f223b20..bc9dc1f847e1 100644 >>>>> --- a/drivers/block/nbd.c >>>>> +++ b/drivers/block/nbd.c >>>>> @@ -1360,6 +1360,8 @@ static void nbd_set_cmd_timeout(struct nbd_device >>>>> *nbd, u64 timeout) >>>>> nbd->tag_set.timeout = timeout * HZ; >>>>> if (timeout) >>>>> blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ); >>>>> + else >>>>> + blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ); >>>>> } >>>>> /* Must be called with config_lock held */ >>>>> >>>> >>>> What about the tag_set.timeout? Thanks, >>> >>> I think user space could set io timeout to 0, thus we set tag_set.timeout = 0 >>> here and also we should tell the block layer >>> to restore 30s timeout in case it is not. tag_set.timeout == 0 >>> imply 30s io timeout and retrying after timeout. >>> >>> (Sorry, I am not sure if I understand your question here. Could >>> you explain a little more if needed?) >>> >> >> I misunderstood what I was using the tagset timeout for. We don't want this >> here, if we're dropping a config for an nbd device and we want to reset it to >> defaults then we need to add this to nbd_config_put(). Thanks, > > AFAIK If we killed a nbd server, then restarted it and reconfigured > the nbd socket, I think we might not reconfigure IO timeout to 0 since > nbd_config_put() is not called in such case. So could we still > restore default timeout here. Or am I missing something? > If you kill the NBD server then the config is going to be dropped and need to be reconfigured, so nbd_config_put() will definitely be called. The only case it wouldn't be is if you are using the netlink interface, in which case the device is going to keep all of its original settings. Are you not seeing the final nbd_config_put() being done when you kill the nbd server? That seems like a bug if not, and that should be fixed, and then this timeout thing going in there will fix your issue. Thanks, Josef
On 2020/8/24 10:02 PM, Josef Bacik wrote: > On 8/23/20 11:23 PM, Hou Pu wrote: >> >> >> On 2020/8/21 9:57 PM, Josef Bacik wrote: >>> On 8/21/20 3:21 AM, Hou Pu wrote: >>>> >>>> >>>> On 2020/8/21 3:03 AM, Josef Bacik wrote: >>>>> On 8/10/20 8:00 AM, Hou Pu wrote: >>>>>> If we configured io timeout of nbd0 to 100s. Later after we >>>>>> finished using it, we configured nbd0 again and set the io >>>>>> timeout to 0. We expect it would timeout after 30 seconds >>>>>> and keep retry. But in fact we could not change the timeout >>>>>> when we set it to 0. the timeout is still the original 100s. >>>>>> >>>>>> So change the timeout to default 30s when we set it to zero. >>>>>> It also behaves same as commit 2da22da57348 ("nbd: fix zero >>>>>> cmd timeout handling v2"). >>>>>> >>>>>> It becomes more important if we were reconfigure a nbd device >>>>>> and the io timeout it set to zero. Because it could take 30s >>>>>> to detect the new socket and thus io could be completed more >>>>>> quickly compared to 100s. >>>>>> >>>>>> Signed-off-by: Hou Pu <houpu@bytedance.com> >>>>>> --- >>>>>> drivers/block/nbd.c | 2 ++ >>>>>> 1 file changed, 2 insertions(+) >>>>>> >>>>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c >>>>>> index ce7e9f223b20..bc9dc1f847e1 100644 >>>>>> --- a/drivers/block/nbd.c >>>>>> +++ b/drivers/block/nbd.c >>>>>> @@ -1360,6 +1360,8 @@ static void nbd_set_cmd_timeout(struct >>>>>> nbd_device *nbd, u64 timeout) >>>>>> nbd->tag_set.timeout = timeout * HZ; >>>>>> if (timeout) >>>>>> blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ); >>>>>> + else >>>>>> + blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ); >>>>>> } >>>>>> /* Must be called with config_lock held */ >>>>>> >>>>> >>>>> What about the tag_set.timeout? Thanks, >>>> >>>> I think user space could set io timeout to 0, thus we set >>>> tag_set.timeout = 0 here and also we should tell the block layer >>>> to restore 30s timeout in case it is not. tag_set.timeout == 0 >>>> imply 30s io timeout and retrying after timeout. >>>> >>>> (Sorry, I am not sure if I understand your question here. Could >>>> you explain a little more if needed?) >>>> >>> >>> I misunderstood what I was using the tagset timeout for. We don't >>> want this here, if we're dropping a config for an nbd device and we >>> want to reset it to defaults then we need to add this to >>> nbd_config_put(). Thanks, >> >> AFAIK If we killed a nbd server, then restarted it and reconfigured >> the nbd socket, I think we might not reconfigure IO timeout to 0 since >> nbd_config_put() is not called in such case. So could we still >> restore default timeout here. Or am I missing something? >> > > If you kill the NBD server then the config is going to be dropped and > need to be reconfigured, so nbd_config_put() will definitely be called. > The only case it wouldn't be is if you are using the netlink interface, > in which case the device is going to keep all of its original settings. > Are you not seeing the final nbd_config_put() being done when you kill > the nbd server? That seems like a bug if not, and that should be fixed, > and then this timeout thing going in there will fix your issue. Thanks, I was using the netlink interface. So I could use the reconnect feature to update the nbd server without impacting the user of nbd device. I did not see the final nbd_config_put() when I killed the nbd server. After I killed the nbd server, the recv_work() put 1 config_ref. Another ref count is still held by nbd_genl_connect(). I thought it was as expected. Beside in nbd_genl_reconfigure(), it is checked nbd->config_refs should not be zero by: if (!refcount_inc_not_zero(&nbd->config_refs)) { dev_err(nbd_to_dev(nbd), "not configured, cannot reconfigure\n"); nbd_put(nbd); return -EINVAL; } So AFAIK this behavior is as expected. > > Josef
On 8/25/20 4:27 AM, Hou Pu wrote: > > > On 2020/8/24 10:02 PM, Josef Bacik wrote: >> On 8/23/20 11:23 PM, Hou Pu wrote: >>> >>> >>> On 2020/8/21 9:57 PM, Josef Bacik wrote: >>>> On 8/21/20 3:21 AM, Hou Pu wrote: >>>>> >>>>> >>>>> On 2020/8/21 3:03 AM, Josef Bacik wrote: >>>>>> On 8/10/20 8:00 AM, Hou Pu wrote: >>>>>>> If we configured io timeout of nbd0 to 100s. Later after we >>>>>>> finished using it, we configured nbd0 again and set the io >>>>>>> timeout to 0. We expect it would timeout after 30 seconds >>>>>>> and keep retry. But in fact we could not change the timeout >>>>>>> when we set it to 0. the timeout is still the original 100s. >>>>>>> >>>>>>> So change the timeout to default 30s when we set it to zero. >>>>>>> It also behaves same as commit 2da22da57348 ("nbd: fix zero >>>>>>> cmd timeout handling v2"). >>>>>>> >>>>>>> It becomes more important if we were reconfigure a nbd device >>>>>>> and the io timeout it set to zero. Because it could take 30s >>>>>>> to detect the new socket and thus io could be completed more >>>>>>> quickly compared to 100s. >>>>>>> >>>>>>> Signed-off-by: Hou Pu <houpu@bytedance.com> >>>>>>> --- >>>>>>> drivers/block/nbd.c | 2 ++ >>>>>>> 1 file changed, 2 insertions(+) >>>>>>> >>>>>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c >>>>>>> index ce7e9f223b20..bc9dc1f847e1 100644 >>>>>>> --- a/drivers/block/nbd.c >>>>>>> +++ b/drivers/block/nbd.c >>>>>>> @@ -1360,6 +1360,8 @@ static void nbd_set_cmd_timeout(struct nbd_device >>>>>>> *nbd, u64 timeout) >>>>>>> nbd->tag_set.timeout = timeout * HZ; >>>>>>> if (timeout) >>>>>>> blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ); >>>>>>> + else >>>>>>> + blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ); >>>>>>> } >>>>>>> /* Must be called with config_lock held */ >>>>>>> >>>>>> >>>>>> What about the tag_set.timeout? Thanks, >>>>> >>>>> I think user space could set io timeout to 0, thus we set tag_set.timeout = >>>>> 0 here and also we should tell the block layer >>>>> to restore 30s timeout in case it is not. tag_set.timeout == 0 >>>>> imply 30s io timeout and retrying after timeout. >>>>> >>>>> (Sorry, I am not sure if I understand your question here. Could >>>>> you explain a little more if needed?) >>>>> >>>> >>>> I misunderstood what I was using the tagset timeout for. We don't want this >>>> here, if we're dropping a config for an nbd device and we want to reset it >>>> to defaults then we need to add this to nbd_config_put(). Thanks, >>> >>> AFAIK If we killed a nbd server, then restarted it and reconfigured >>> the nbd socket, I think we might not reconfigure IO timeout to 0 since >>> nbd_config_put() is not called in such case. So could we still >>> restore default timeout here. Or am I missing something? >>> >> >> If you kill the NBD server then the config is going to be dropped and need to >> be reconfigured, so nbd_config_put() will definitely be called. The only case >> it wouldn't be is if you are using the netlink interface, in which case the >> device is going to keep all of its original settings. Are you not seeing the >> final nbd_config_put() being done when you kill the nbd server? That seems >> like a bug if not, and that should be fixed, and then this timeout thing going >> in there will fix your issue. Thanks, > > I was using the netlink interface. So I could use the reconnect > feature to update the nbd server without impacting the user of > nbd device. > > I did not see the final nbd_config_put() when I killed the nbd server. > After I killed the nbd server, the recv_work() put 1 config_ref. > Another ref count is still held by nbd_genl_connect(). I thought it > was as expected. > > Beside in nbd_genl_reconfigure(), it is checked nbd->config_refs should > not be zero by: > if (!refcount_inc_not_zero(&nbd->config_refs)) { > dev_err(nbd_to_dev(nbd), > "not configured, cannot reconfigure\n"); > nbd_put(nbd); > return -EINVAL; > } > So AFAIK this behavior is as expected. Ahh ok I see what you're getting at. Ok I agree, you can add Reviewed-by: Josef Bacik <josef@toxicpanda.com> Thanks, Josef
On 2020/8/26 1:29 上午, Josef Bacik wrote: > On 8/25/20 4:27 AM, Hou Pu wrote: >> >> >> On 2020/8/24 10:02 PM, Josef Bacik wrote: >>> On 8/23/20 11:23 PM, Hou Pu wrote: >>>> >>>> >>>> On 2020/8/21 9:57 PM, Josef Bacik wrote: >>>>> On 8/21/20 3:21 AM, Hou Pu wrote: >>>>>> >>>>>> >>>>>> On 2020/8/21 3:03 AM, Josef Bacik wrote: >>>>>>> On 8/10/20 8:00 AM, Hou Pu wrote: >>>>>>>> If we configured io timeout of nbd0 to 100s. Later after we >>>>>>>> finished using it, we configured nbd0 again and set the io >>>>>>>> timeout to 0. We expect it would timeout after 30 seconds >>>>>>>> and keep retry. But in fact we could not change the timeout >>>>>>>> when we set it to 0. the timeout is still the original 100s. >>>>>>>> >>>>>>>> So change the timeout to default 30s when we set it to zero. >>>>>>>> It also behaves same as commit 2da22da57348 ("nbd: fix zero >>>>>>>> cmd timeout handling v2"). >>>>>>>> >>>>>>>> It becomes more important if we were reconfigure a nbd device >>>>>>>> and the io timeout it set to zero. Because it could take 30s >>>>>>>> to detect the new socket and thus io could be completed more >>>>>>>> quickly compared to 100s. >>>>>>>> >>>>>>>> Signed-off-by: Hou Pu <houpu@bytedance.com> >>>>>>>> --- >>>>>>>> drivers/block/nbd.c | 2 ++ >>>>>>>> 1 file changed, 2 insertions(+) >>>>>>>> >>>>>>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c >>>>>>>> index ce7e9f223b20..bc9dc1f847e1 100644 >>>>>>>> --- a/drivers/block/nbd.c >>>>>>>> +++ b/drivers/block/nbd.c >>>>>>>> @@ -1360,6 +1360,8 @@ static void nbd_set_cmd_timeout(struct >>>>>>>> nbd_device *nbd, u64 timeout) >>>>>>>> nbd->tag_set.timeout = timeout * HZ; >>>>>>>> if (timeout) >>>>>>>> blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ); >>>>>>>> + else >>>>>>>> + blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ); >>>>>>>> } >>>>>>>> /* Must be called with config_lock held */ >>>>>>>> >>>>>>> >>>>>>> What about the tag_set.timeout? Thanks, >>>>>> >>>>>> I think user space could set io timeout to 0, thus we set >>>>>> tag_set.timeout = 0 here and also we should tell the block layer >>>>>> to restore 30s timeout in case it is not. tag_set.timeout == 0 >>>>>> imply 30s io timeout and retrying after timeout. >>>>>> >>>>>> (Sorry, I am not sure if I understand your question here. Could >>>>>> you explain a little more if needed?) >>>>>> >>>>> >>>>> I misunderstood what I was using the tagset timeout for. We don't >>>>> want this here, if we're dropping a config for an nbd device and we >>>>> want to reset it to defaults then we need to add this to >>>>> nbd_config_put(). Thanks, >>>> >>>> AFAIK If we killed a nbd server, then restarted it and reconfigured >>>> the nbd socket, I think we might not reconfigure IO timeout to 0 since >>>> nbd_config_put() is not called in such case. So could we still >>>> restore default timeout here. Or am I missing something? >>>> >>> >>> If you kill the NBD server then the config is going to be dropped and >>> need to be reconfigured, so nbd_config_put() will definitely be >>> called. The only case it wouldn't be is if you are using the netlink >>> interface, in which case the device is going to keep all of its >>> original settings. Are you not seeing the final nbd_config_put() >>> being done when you kill the nbd server? That seems like a bug if >>> not, and that should be fixed, and then this timeout thing going in >>> there will fix your issue. Thanks, >> >> I was using the netlink interface. So I could use the reconnect >> feature to update the nbd server without impacting the user of >> nbd device. >> >> I did not see the final nbd_config_put() when I killed the nbd server. >> After I killed the nbd server, the recv_work() put 1 config_ref. >> Another ref count is still held by nbd_genl_connect(). I thought it >> was as expected. >> >> Beside in nbd_genl_reconfigure(), it is checked nbd->config_refs should >> not be zero by: >> if (!refcount_inc_not_zero(&nbd->config_refs)) { >> dev_err(nbd_to_dev(nbd), >> "not configured, cannot reconfigure\n"); >> nbd_put(nbd); >> return -EINVAL; >> } >> So AFAIK this behavior is as expected. > > Ahh ok I see what you're getting at. Ok I agree, you can add > > Reviewed-by: Josef Bacik <josef@toxicpanda.com> Thanks for your review, Hou > > Thanks, > > Josef
On 8/25/20 7:51 PM, Hou Pu wrote: > > > On 2020/8/26 1:29 上午, Josef Bacik wrote: >> On 8/25/20 4:27 AM, Hou Pu wrote: >>> >>> >>> On 2020/8/24 10:02 PM, Josef Bacik wrote: >>>> On 8/23/20 11:23 PM, Hou Pu wrote: >>>>> >>>>> >>>>> On 2020/8/21 9:57 PM, Josef Bacik wrote: >>>>>> On 8/21/20 3:21 AM, Hou Pu wrote: >>>>>>> >>>>>>> >>>>>>> On 2020/8/21 3:03 AM, Josef Bacik wrote: >>>>>>>> On 8/10/20 8:00 AM, Hou Pu wrote: >>>>>>>>> If we configured io timeout of nbd0 to 100s. Later after we >>>>>>>>> finished using it, we configured nbd0 again and set the io >>>>>>>>> timeout to 0. We expect it would timeout after 30 seconds >>>>>>>>> and keep retry. But in fact we could not change the timeout >>>>>>>>> when we set it to 0. the timeout is still the original 100s. >>>>>>>>> >>>>>>>>> So change the timeout to default 30s when we set it to zero. >>>>>>>>> It also behaves same as commit 2da22da57348 ("nbd: fix zero >>>>>>>>> cmd timeout handling v2"). >>>>>>>>> >>>>>>>>> It becomes more important if we were reconfigure a nbd device >>>>>>>>> and the io timeout it set to zero. Because it could take 30s >>>>>>>>> to detect the new socket and thus io could be completed more >>>>>>>>> quickly compared to 100s. >>>>>>>>> >>>>>>>>> Signed-off-by: Hou Pu <houpu@bytedance.com> >>>>>>>>> --- >>>>>>>>> drivers/block/nbd.c | 2 ++ >>>>>>>>> 1 file changed, 2 insertions(+) >>>>>>>>> >>>>>>>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c >>>>>>>>> index ce7e9f223b20..bc9dc1f847e1 100644 >>>>>>>>> --- a/drivers/block/nbd.c >>>>>>>>> +++ b/drivers/block/nbd.c >>>>>>>>> @@ -1360,6 +1360,8 @@ static void nbd_set_cmd_timeout(struct >>>>>>>>> nbd_device *nbd, u64 timeout) >>>>>>>>> nbd->tag_set.timeout = timeout * HZ; >>>>>>>>> if (timeout) >>>>>>>>> blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ); >>>>>>>>> + else >>>>>>>>> + blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ); >>>>>>>>> } >>>>>>>>> /* Must be called with config_lock held */ >>>>>>>>> >>>>>>>> >>>>>>>> What about the tag_set.timeout? Thanks, >>>>>>> >>>>>>> I think user space could set io timeout to 0, thus we set >>>>>>> tag_set.timeout = 0 here and also we should tell the block layer >>>>>>> to restore 30s timeout in case it is not. tag_set.timeout == 0 >>>>>>> imply 30s io timeout and retrying after timeout. >>>>>>> >>>>>>> (Sorry, I am not sure if I understand your question here. Could >>>>>>> you explain a little more if needed?) >>>>>>> >>>>>> >>>>>> I misunderstood what I was using the tagset timeout for. We don't >>>>>> want this here, if we're dropping a config for an nbd device and we >>>>>> want to reset it to defaults then we need to add this to >>>>>> nbd_config_put(). Thanks, >>>>> >>>>> AFAIK If we killed a nbd server, then restarted it and reconfigured >>>>> the nbd socket, I think we might not reconfigure IO timeout to 0 since >>>>> nbd_config_put() is not called in such case. So could we still >>>>> restore default timeout here. Or am I missing something? >>>>> >>>> >>>> If you kill the NBD server then the config is going to be dropped and >>>> need to be reconfigured, so nbd_config_put() will definitely be >>>> called. The only case it wouldn't be is if you are using the netlink >>>> interface, in which case the device is going to keep all of its >>>> original settings. Are you not seeing the final nbd_config_put() >>>> being done when you kill the nbd server? That seems like a bug if >>>> not, and that should be fixed, and then this timeout thing going in >>>> there will fix your issue. Thanks, >>> >>> I was using the netlink interface. So I could use the reconnect >>> feature to update the nbd server without impacting the user of >>> nbd device. >>> >>> I did not see the final nbd_config_put() when I killed the nbd server. >>> After I killed the nbd server, the recv_work() put 1 config_ref. >>> Another ref count is still held by nbd_genl_connect(). I thought it >>> was as expected. >>> >>> Beside in nbd_genl_reconfigure(), it is checked nbd->config_refs should >>> not be zero by: >>> if (!refcount_inc_not_zero(&nbd->config_refs)) { >>> dev_err(nbd_to_dev(nbd), >>> "not configured, cannot reconfigure\n"); >>> nbd_put(nbd); >>> return -EINVAL; >>> } >>> So AFAIK this behavior is as expected. >> >> Ahh ok I see what you're getting at. Ok I agree, you can add >> >> Reviewed-by: Josef Bacik <josef@toxicpanda.com> > > Thanks for your review, Applied, thanks.
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index ce7e9f223b20..bc9dc1f847e1 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -1360,6 +1360,8 @@ static void nbd_set_cmd_timeout(struct nbd_device *nbd, u64 timeout) nbd->tag_set.timeout = timeout * HZ; if (timeout) blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ); + else + blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ); } /* Must be called with config_lock held */
If we configured io timeout of nbd0 to 100s. Later after we finished using it, we configured nbd0 again and set the io timeout to 0. We expect it would timeout after 30 seconds and keep retry. But in fact we could not change the timeout when we set it to 0. the timeout is still the original 100s. So change the timeout to default 30s when we set it to zero. It also behaves same as commit 2da22da57348 ("nbd: fix zero cmd timeout handling v2"). It becomes more important if we were reconfigure a nbd device and the io timeout it set to zero. Because it could take 30s to detect the new socket and thus io could be completed more quickly compared to 100s. Signed-off-by: Hou Pu <houpu@bytedance.com> --- drivers/block/nbd.c | 2 ++ 1 file changed, 2 insertions(+)