diff mbox series

[1/3] nbd: fix connection timed out error after reconnecting to server

Message ID 1558691036-16281-1-git-send-email-yotta.liu@ucloud.cn (mailing list archive)
State New, archived
Headers show
Series [1/3] nbd: fix connection timed out error after reconnecting to server | expand

Commit Message

Yao Liu May 24, 2019, 9:43 a.m. UTC
Some I/O requests that have been sent succussfully but have not yet been
replied won't be resubmitted after reconnecting because of server restart,
so we add a list to track them.

Signed-off-by: Yao Liu <yotta.liu@ucloud.cn>
---
 drivers/block/nbd.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

Comments

Josef Bacik May 24, 2019, 1:07 p.m. UTC | #1
On Fri, May 24, 2019 at 05:43:54PM +0800, Yao Liu wrote:
> Some I/O requests that have been sent succussfully but have not yet been
> replied won't be resubmitted after reconnecting because of server restart,
> so we add a list to track them.
> 
> Signed-off-by: Yao Liu <yotta.liu@ucloud.cn>

Nack, this is what the timeout stuff is supposed to handle.  The commands will
timeout and we'll resubmit them if we have alive sockets.  Thanks,

Josef
Yao Liu May 27, 2019, 6:07 p.m. UTC | #2
On Fri, May 24, 2019 at 09:07:42AM -0400, Josef Bacik wrote:
> On Fri, May 24, 2019 at 05:43:54PM +0800, Yao Liu wrote:
> > Some I/O requests that have been sent succussfully but have not yet been
> > replied won't be resubmitted after reconnecting because of server restart,
> > so we add a list to track them.
> > 
> > Signed-off-by: Yao Liu <yotta.liu@ucloud.cn>
> 
> Nack, this is what the timeout stuff is supposed to handle.  The commands will
> timeout and we'll resubmit them if we have alive sockets.  Thanks,
> 
> Josef
> 

On the one hand, if num_connections == 1 and the only sock has dead,
then we do nbd_genl_reconfigure to reconnect within dead_conn_timeout,
nbd_xmit_timeout will not resubmit commands that have been sent
succussfully but have not yet been replied. The log is as follows:
 
[270551.108746] block nbd0: Receive control failed (result -104)
[270551.108747] block nbd0: Send control failed (result -32)
[270551.108750] block nbd0: Request send failed, requeueing
[270551.116207] block nbd0: Attempted send on invalid socket
[270556.119584] block nbd0: reconnected socket
[270581.161751] block nbd0: Connection timed out
[270581.165038] block nbd0: shutting down sockets
[270581.165041] print_req_error: I/O error, dev nbd0, sector 5123224 flags 8801
[270581.165149] print_req_error: I/O error, dev nbd0, sector 5123232 flags 8801
[270581.165580] block nbd0: Connection timed out
[270581.165587] print_req_error: I/O error, dev nbd0, sector 844680 flags 8801
[270581.166184] print_req_error: I/O error, dev nbd0, sector 5123240 flags 8801
[270581.166554] block nbd0: Connection timed out
[270581.166576] print_req_error: I/O error, dev nbd0, sector 844688 flags 8801
[270581.167124] print_req_error: I/O error, dev nbd0, sector 5123248 flags 8801
[270581.167590] block nbd0: Connection timed out
[270581.167597] print_req_error: I/O error, dev nbd0, sector 844696 flags 8801
[270581.168021] print_req_error: I/O error, dev nbd0, sector 5123256 flags 8801
[270581.168487] block nbd0: Connection timed out
[270581.168493] print_req_error: I/O error, dev nbd0, sector 844704 flags 8801
[270581.170183] print_req_error: I/O error, dev nbd0, sector 5123264 flags 8801
[270581.170540] block nbd0: Connection timed out
[270581.173333] block nbd0: Connection timed out
[270581.173728] block nbd0: Connection timed out
[270581.174135] block nbd0: Connection timed out
 
On the other hand, if we wait nbd_xmit_timeout to handle resubmission,
the I/O requests will have a big delay. For example, if timeout time is 30s,
and from sock dead to nbd_genl_reconfigure returned OK we only spend
2s, the I/O requests will still be handled by nbd_xmit_timeout after 30s.
Josef Bacik May 28, 2019, 4:57 p.m. UTC | #3
On Tue, May 28, 2019 at 02:07:43AM +0800, Yao Liu wrote:
> On Fri, May 24, 2019 at 09:07:42AM -0400, Josef Bacik wrote:
> > On Fri, May 24, 2019 at 05:43:54PM +0800, Yao Liu wrote:
> > > Some I/O requests that have been sent succussfully but have not yet been
> > > replied won't be resubmitted after reconnecting because of server restart,
> > > so we add a list to track them.
> > > 
> > > Signed-off-by: Yao Liu <yotta.liu@ucloud.cn>
> > 
> > Nack, this is what the timeout stuff is supposed to handle.  The commands will
> > timeout and we'll resubmit them if we have alive sockets.  Thanks,
> > 
> > Josef
> > 
> 
> On the one hand, if num_connections == 1 and the only sock has dead,
> then we do nbd_genl_reconfigure to reconnect within dead_conn_timeout,
> nbd_xmit_timeout will not resubmit commands that have been sent
> succussfully but have not yet been replied. The log is as follows:
>  
> [270551.108746] block nbd0: Receive control failed (result -104)
> [270551.108747] block nbd0: Send control failed (result -32)
> [270551.108750] block nbd0: Request send failed, requeueing
> [270551.116207] block nbd0: Attempted send on invalid socket
> [270556.119584] block nbd0: reconnected socket
> [270581.161751] block nbd0: Connection timed out
> [270581.165038] block nbd0: shutting down sockets
> [270581.165041] print_req_error: I/O error, dev nbd0, sector 5123224 flags 8801
> [270581.165149] print_req_error: I/O error, dev nbd0, sector 5123232 flags 8801
> [270581.165580] block nbd0: Connection timed out
> [270581.165587] print_req_error: I/O error, dev nbd0, sector 844680 flags 8801
> [270581.166184] print_req_error: I/O error, dev nbd0, sector 5123240 flags 8801
> [270581.166554] block nbd0: Connection timed out
> [270581.166576] print_req_error: I/O error, dev nbd0, sector 844688 flags 8801
> [270581.167124] print_req_error: I/O error, dev nbd0, sector 5123248 flags 8801
> [270581.167590] block nbd0: Connection timed out
> [270581.167597] print_req_error: I/O error, dev nbd0, sector 844696 flags 8801
> [270581.168021] print_req_error: I/O error, dev nbd0, sector 5123256 flags 8801
> [270581.168487] block nbd0: Connection timed out
> [270581.168493] print_req_error: I/O error, dev nbd0, sector 844704 flags 8801
> [270581.170183] print_req_error: I/O error, dev nbd0, sector 5123264 flags 8801
> [270581.170540] block nbd0: Connection timed out
> [270581.173333] block nbd0: Connection timed out
> [270581.173728] block nbd0: Connection timed out
> [270581.174135] block nbd0: Connection timed out
>  
> On the other hand, if we wait nbd_xmit_timeout to handle resubmission,
> the I/O requests will have a big delay. For example, if timeout time is 30s,
> and from sock dead to nbd_genl_reconfigure returned OK we only spend
> 2s, the I/O requests will still be handled by nbd_xmit_timeout after 30s.

We have to wait for the full timeout anyway to know that the socket went down,
so it'll be re-submitted right away and then we'll wait on the new connection.

Now we could definitely have requests that were submitted well after the first
thing that failed, so their timeout would be longer than simply retrying them,
but we have no idea of knowing which ones timed out and which ones didn't.  This
way lies pain, because we have to matchup tags with handles.  This is why we
rely on the generic timeout infrastructure, so everything is handled correctly
without ending up with duplicate submissions/replies.  Thanks,

Josef
Yao Liu May 28, 2019, 7:04 p.m. UTC | #4
On Tue, May 28, 2019 at 12:57:59PM -0400, Josef Bacik wrote:
> On Tue, May 28, 2019 at 02:07:43AM +0800, Yao Liu wrote:
> > On Fri, May 24, 2019 at 09:07:42AM -0400, Josef Bacik wrote:
> > > On Fri, May 24, 2019 at 05:43:54PM +0800, Yao Liu wrote:
> > > > Some I/O requests that have been sent succussfully but have not yet been
> > > > replied won't be resubmitted after reconnecting because of server restart,
> > > > so we add a list to track them.
> > > > 
> > > > Signed-off-by: Yao Liu <yotta.liu@ucloud.cn>
> > > 
> > > Nack, this is what the timeout stuff is supposed to handle.  The commands will
> > > timeout and we'll resubmit them if we have alive sockets.  Thanks,
> > > 
> > > Josef
> > > 
> > 
> > On the one hand, if num_connections == 1 and the only sock has dead,
> > then we do nbd_genl_reconfigure to reconnect within dead_conn_timeout,
> > nbd_xmit_timeout will not resubmit commands that have been sent
> > succussfully but have not yet been replied. The log is as follows:
> >  
> > [270551.108746] block nbd0: Receive control failed (result -104)
> > [270551.108747] block nbd0: Send control failed (result -32)
> > [270551.108750] block nbd0: Request send failed, requeueing
> > [270551.116207] block nbd0: Attempted send on invalid socket
> > [270556.119584] block nbd0: reconnected socket
> > [270581.161751] block nbd0: Connection timed out
> > [270581.165038] block nbd0: shutting down sockets
> > [270581.165041] print_req_error: I/O error, dev nbd0, sector 5123224 flags 8801
> > [270581.165149] print_req_error: I/O error, dev nbd0, sector 5123232 flags 8801
> > [270581.165580] block nbd0: Connection timed out
> > [270581.165587] print_req_error: I/O error, dev nbd0, sector 844680 flags 8801
> > [270581.166184] print_req_error: I/O error, dev nbd0, sector 5123240 flags 8801
> > [270581.166554] block nbd0: Connection timed out
> > [270581.166576] print_req_error: I/O error, dev nbd0, sector 844688 flags 8801
> > [270581.167124] print_req_error: I/O error, dev nbd0, sector 5123248 flags 8801
> > [270581.167590] block nbd0: Connection timed out
> > [270581.167597] print_req_error: I/O error, dev nbd0, sector 844696 flags 8801
> > [270581.168021] print_req_error: I/O error, dev nbd0, sector 5123256 flags 8801
> > [270581.168487] block nbd0: Connection timed out
> > [270581.168493] print_req_error: I/O error, dev nbd0, sector 844704 flags 8801
> > [270581.170183] print_req_error: I/O error, dev nbd0, sector 5123264 flags 8801
> > [270581.170540] block nbd0: Connection timed out
> > [270581.173333] block nbd0: Connection timed out
> > [270581.173728] block nbd0: Connection timed out
> > [270581.174135] block nbd0: Connection timed out
> >  
> > On the other hand, if we wait nbd_xmit_timeout to handle resubmission,
> > the I/O requests will have a big delay. For example, if timeout time is 30s,
> > and from sock dead to nbd_genl_reconfigure returned OK we only spend
> > 2s, the I/O requests will still be handled by nbd_xmit_timeout after 30s.
> 
> We have to wait for the full timeout anyway to know that the socket went down,
> so it'll be re-submitted right away and then we'll wait on the new connection.
> 
> Now we could definitely have requests that were submitted well after the first
> thing that failed, so their timeout would be longer than simply retrying them,
> but we have no idea of knowing which ones timed out and which ones didn't.  This
> way lies pain, because we have to matchup tags with handles.  This is why we
> rely on the generic timeout infrastructure, so everything is handled correctly
> without ending up with duplicate submissions/replies.  Thanks,
> 
> Josef
> 

But as I mentioned before, if num_connections == 1, nbd_xmit_timeout won't re-submit
commands and I/O error will occur. Should we change the condition
		if (config->num_connections > 1)
to
		if (config->num_connections >= 1)
?
Josef Bacik May 29, 2019, 1:49 p.m. UTC | #5
On Wed, May 29, 2019 at 03:04:46AM +0800, Yao Liu wrote:
> On Tue, May 28, 2019 at 12:57:59PM -0400, Josef Bacik wrote:
> > On Tue, May 28, 2019 at 02:07:43AM +0800, Yao Liu wrote:
> > > On Fri, May 24, 2019 at 09:07:42AM -0400, Josef Bacik wrote:
> > > > On Fri, May 24, 2019 at 05:43:54PM +0800, Yao Liu wrote:
> > > > > Some I/O requests that have been sent succussfully but have not yet been
> > > > > replied won't be resubmitted after reconnecting because of server restart,
> > > > > so we add a list to track them.
> > > > > 
> > > > > Signed-off-by: Yao Liu <yotta.liu@ucloud.cn>
> > > > 
> > > > Nack, this is what the timeout stuff is supposed to handle.  The commands will
> > > > timeout and we'll resubmit them if we have alive sockets.  Thanks,
> > > > 
> > > > Josef
> > > > 
> > > 
> > > On the one hand, if num_connections == 1 and the only sock has dead,
> > > then we do nbd_genl_reconfigure to reconnect within dead_conn_timeout,
> > > nbd_xmit_timeout will not resubmit commands that have been sent
> > > succussfully but have not yet been replied. The log is as follows:
> > >  
> > > [270551.108746] block nbd0: Receive control failed (result -104)
> > > [270551.108747] block nbd0: Send control failed (result -32)
> > > [270551.108750] block nbd0: Request send failed, requeueing
> > > [270551.116207] block nbd0: Attempted send on invalid socket
> > > [270556.119584] block nbd0: reconnected socket
> > > [270581.161751] block nbd0: Connection timed out
> > > [270581.165038] block nbd0: shutting down sockets
> > > [270581.165041] print_req_error: I/O error, dev nbd0, sector 5123224 flags 8801
> > > [270581.165149] print_req_error: I/O error, dev nbd0, sector 5123232 flags 8801
> > > [270581.165580] block nbd0: Connection timed out
> > > [270581.165587] print_req_error: I/O error, dev nbd0, sector 844680 flags 8801
> > > [270581.166184] print_req_error: I/O error, dev nbd0, sector 5123240 flags 8801
> > > [270581.166554] block nbd0: Connection timed out
> > > [270581.166576] print_req_error: I/O error, dev nbd0, sector 844688 flags 8801
> > > [270581.167124] print_req_error: I/O error, dev nbd0, sector 5123248 flags 8801
> > > [270581.167590] block nbd0: Connection timed out
> > > [270581.167597] print_req_error: I/O error, dev nbd0, sector 844696 flags 8801
> > > [270581.168021] print_req_error: I/O error, dev nbd0, sector 5123256 flags 8801
> > > [270581.168487] block nbd0: Connection timed out
> > > [270581.168493] print_req_error: I/O error, dev nbd0, sector 844704 flags 8801
> > > [270581.170183] print_req_error: I/O error, dev nbd0, sector 5123264 flags 8801
> > > [270581.170540] block nbd0: Connection timed out
> > > [270581.173333] block nbd0: Connection timed out
> > > [270581.173728] block nbd0: Connection timed out
> > > [270581.174135] block nbd0: Connection timed out
> > >  
> > > On the other hand, if we wait nbd_xmit_timeout to handle resubmission,
> > > the I/O requests will have a big delay. For example, if timeout time is 30s,
> > > and from sock dead to nbd_genl_reconfigure returned OK we only spend
> > > 2s, the I/O requests will still be handled by nbd_xmit_timeout after 30s.
> > 
> > We have to wait for the full timeout anyway to know that the socket went down,
> > so it'll be re-submitted right away and then we'll wait on the new connection.
> > 
> > Now we could definitely have requests that were submitted well after the first
> > thing that failed, so their timeout would be longer than simply retrying them,
> > but we have no idea of knowing which ones timed out and which ones didn't.  This
> > way lies pain, because we have to matchup tags with handles.  This is why we
> > rely on the generic timeout infrastructure, so everything is handled correctly
> > without ending up with duplicate submissions/replies.  Thanks,
> > 
> > Josef
> > 
> 
> But as I mentioned before, if num_connections == 1, nbd_xmit_timeout won't re-submit
> commands and I/O error will occur. Should we change the condition
> 		if (config->num_connections > 1)
> to
> 		if (config->num_connections >= 1)
> ?

Only if you don't have the patch 3 in place though right?  If you fix patch 3 to
allow requeuing if you have a dead connection timer set then you can requeue and
everything is a-ok.  Thanks,

Josef
diff mbox series

Patch

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 053958a..ca69d6e 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -113,6 +113,8 @@  struct nbd_device {
 	struct list_head list;
 	struct task_struct *task_recv;
 	struct task_struct *task_setup;
+	struct mutex outstanding_lock;
+	struct list_head outstanding_queue;
 };
 
 #define NBD_CMD_REQUEUED	1
@@ -125,6 +127,7 @@  struct nbd_cmd {
 	blk_status_t status;
 	unsigned long flags;
 	u32 cmd_cookie;
+	struct list_head outstanding_entry;
 };
 
 #if IS_ENABLED(CONFIG_DEBUG_FS)
@@ -619,6 +622,24 @@  static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
 	return 0;
 }
 
+static struct nbd_cmd *nbd_get_cmd(struct nbd_device *nbd,
+					struct nbd_cmd *xcmd)
+{
+	struct nbd_cmd *cmd, *tmp;
+
+	mutex_lock(&nbd->outstanding_lock);
+	list_for_each_entry_safe(cmd, tmp, &nbd->outstanding_queue, outstanding_entry) {
+		if (cmd != xcmd)
+			continue;
+		list_del_init(&cmd->outstanding_entry);
+		mutex_unlock(&nbd->outstanding_lock);
+		return cmd;
+	}
+	mutex_unlock(&nbd->outstanding_lock);
+
+	return ERR_PTR(-ENOENT);
+}
+
 /* NULL returned = something went wrong, inform userspace */
 static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 {
@@ -714,12 +735,30 @@  static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
 				req, bvec.bv_len);
 		}
 	}
+	cmd = nbd_get_cmd(nbd, cmd);
+	if (IS_ERR(cmd)) {
+		dev_err(disk_to_dev(nbd->disk), "Unexpected reply (%d) %p which not in outstanding queue\n",
+			tag, req);
+		ret = -ENOENT;
+	}
 out:
 	trace_nbd_payload_received(req, handle);
 	mutex_unlock(&cmd->lock);
 	return ret ? ERR_PTR(ret) : cmd;
 }
 
+static void nbd_requeue_outstanding(struct nbd_device *nbd)
+{
+	struct nbd_cmd *cmd, *tmp;
+
+	mutex_lock(&nbd->outstanding_lock);
+	list_for_each_entry_safe(cmd, tmp, &nbd->outstanding_queue, outstanding_entry) {
+		nbd_requeue_cmd(cmd);
+		list_del_init(&cmd->outstanding_entry);
+	}
+	mutex_unlock(&nbd->outstanding_lock);
+}
+
 static void recv_work(struct work_struct *work)
 {
 	struct recv_thread_args *args = container_of(work,
@@ -742,6 +781,7 @@  static void recv_work(struct work_struct *work)
 
 		blk_mq_complete_request(blk_mq_rq_from_pdu(cmd));
 	}
+	nbd_requeue_outstanding(nbd);
 	atomic_dec(&config->recv_threads);
 	wake_up(&config->recv_wq);
 	nbd_config_put(nbd);
@@ -892,6 +932,10 @@  static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
 		nbd_mark_nsock_dead(nbd, nsock, 1);
 		nbd_requeue_cmd(cmd);
 		ret = 0;
+	} else if (ret == 0) {
+		mutex_lock(&nbd->outstanding_lock);
+		list_add_tail(&cmd->outstanding_entry, &nbd->outstanding_queue);
+		mutex_unlock(&nbd->outstanding_lock);
 	}
 out:
 	mutex_unlock(&nsock->tx_lock);
@@ -1615,6 +1659,8 @@  static int nbd_dev_add(int index)
 	refcount_set(&nbd->config_refs, 0);
 	refcount_set(&nbd->refs, 1);
 	INIT_LIST_HEAD(&nbd->list);
+	mutex_init(&nbd->outstanding_lock);
+	INIT_LIST_HEAD(&nbd->outstanding_queue);
 	disk->major = NBD_MAJOR;
 	disk->first_minor = index << part_shift;
 	disk->fops = &nbd_fops;