diff mbox series

[for-5.0] xen-block: Fix double qlist remove

Message ID 20200402130819.1216125-1-anthony.perard@citrix.com (mailing list archive)
State New, archived
Headers show
Series [for-5.0] xen-block: Fix double qlist remove | expand

Commit Message

Anthony PERARD April 2, 2020, 1:08 p.m. UTC
Commit a31ca6801c02 ("qemu/queue.h: clear linked list pointers on
remove") revealed that a request was removed twice from a list, once
in xen_block_finish_request() and a second time in
xen_block_release_request() when both function are called from
xen_block_complete_aio(). But also, the `requests_inflight' counter is
decreased twice, and thus became negative.

This is a bug that was introduced in bfd0d6366043, where a `finished'
list was removed.

This patch simply re-add the `finish' parameter of
xen_block_release_request() so that we can distinguish when we need to
remove a request from the inflight list and when not.

Fixes: bfd0d6366043 ("xen-block: improve response latency")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---
 hw/block/dataplane/xen-block.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

Comments

Paul Durrant April 2, 2020, 2:27 p.m. UTC | #1
> -----Original Message-----
> From: Anthony PERARD <anthony.perard@citrix.com>
> Sent: 02 April 2020 14:08
> To: qemu-devel@nongnu.org
> Cc: qemu-stable@nongnu.org; Anthony PERARD <anthony.perard@citrix.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Paul Durrant <paul@xen.org>; Stefan Hajnoczi <stefanha@redhat.com>; Kevin
> Wolf <kwolf@redhat.com>; Max Reitz <mreitz@redhat.com>; xen-devel@lists.xenproject.org; qemu-
> block@nongnu.org
> Subject: [PATCH for-5.0] xen-block: Fix double qlist remove
> 
> Commit a31ca6801c02 ("qemu/queue.h: clear linked list pointers on
> remove") revealed that a request was removed twice from a list, once
> in xen_block_finish_request() and a second time in
> xen_block_release_request() when both function are called from
> xen_block_complete_aio(). But also, the `requests_inflight' counter is
> decreased twice, and thus became negative.
> 
> This is a bug that was introduced in bfd0d6366043, where a `finished'
> list was removed.
> 
> This patch simply re-add the `finish' parameter of
> xen_block_release_request() so that we can distinguish when we need to
> remove a request from the inflight list and when not.
> 
> Fixes: bfd0d6366043 ("xen-block: improve response latency")
> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>

It looks to me like it would just be more straightforward to simply drop the QLIST_REMOVE and requests_inflight-- from
xen_block_release_request() and simply insist that xen_block_finish_request() is called in all cases (which I think means adding one
extra call to it in xen_block_handle_requests()).

  Paul

> ---
>  hw/block/dataplane/xen-block.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
> index 288a87a814ad..6cc089fc561f 100644
> --- a/hw/block/dataplane/xen-block.c
> +++ b/hw/block/dataplane/xen-block.c
> @@ -123,15 +123,19 @@ static void xen_block_finish_request(XenBlockRequest *request)
>      dataplane->requests_inflight--;
>  }
> 
> -static void xen_block_release_request(XenBlockRequest *request)
> +static void xen_block_release_request(XenBlockRequest *request, bool finish)
>  {
>      XenBlockDataPlane *dataplane = request->dataplane;
> 
> -    QLIST_REMOVE(request, list);
> +    if (!finish) {
> +        QLIST_REMOVE(request, list);
> +    }
>      reset_request(request);
>      request->dataplane = dataplane;
>      QLIST_INSERT_HEAD(&dataplane->freelist, request, list);
> -    dataplane->requests_inflight--;
> +    if (!finish) {
> +        dataplane->requests_inflight--;
> +    }
>  }
> 
>  /*
> @@ -316,7 +320,7 @@ static void xen_block_complete_aio(void *opaque, int ret)
>              error_report_err(local_err);
>          }
>      }
> -    xen_block_release_request(request);
> +    xen_block_release_request(request, true);
> 
>      if (dataplane->more_work) {
>          qemu_bh_schedule(dataplane->bh);
> @@ -585,7 +589,7 @@ static bool xen_block_handle_requests(XenBlockDataPlane *dataplane)
>                      error_report_err(local_err);
>                  }
>              }
> -            xen_block_release_request(request);
> +            xen_block_release_request(request, false);
>              continue;
>          }
> 
> --
> Anthony PERARD
Anthony PERARD April 6, 2020, 10:59 a.m. UTC | #2
On Thu, Apr 02, 2020 at 03:27:22PM +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Anthony PERARD <anthony.perard@citrix.com>
> > Sent: 02 April 2020 14:08
> > To: qemu-devel@nongnu.org
> > Cc: qemu-stable@nongnu.org; Anthony PERARD <anthony.perard@citrix.com>; Stefano Stabellini
> > <sstabellini@kernel.org>; Paul Durrant <paul@xen.org>; Stefan Hajnoczi <stefanha@redhat.com>; Kevin
> > Wolf <kwolf@redhat.com>; Max Reitz <mreitz@redhat.com>; xen-devel@lists.xenproject.org; qemu-
> > block@nongnu.org
> > Subject: [PATCH for-5.0] xen-block: Fix double qlist remove
> > 
> > Commit a31ca6801c02 ("qemu/queue.h: clear linked list pointers on
> > remove") revealed that a request was removed twice from a list, once
> > in xen_block_finish_request() and a second time in
> > xen_block_release_request() when both function are called from
> > xen_block_complete_aio(). But also, the `requests_inflight' counter is
> > decreased twice, and thus became negative.
> > 
> > This is a bug that was introduced in bfd0d6366043, where a `finished'
> > list was removed.
> > 
> > This patch simply re-add the `finish' parameter of
> > xen_block_release_request() so that we can distinguish when we need to
> > remove a request from the inflight list and when not.
> > 
> > Fixes: bfd0d6366043 ("xen-block: improve response latency")
> > Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> 
> It looks to me like it would just be more straightforward to simply drop the QLIST_REMOVE and requests_inflight-- from
> xen_block_release_request() and simply insist that xen_block_finish_request() is called in all cases (which I think means adding one
> extra call to it in xen_block_handle_requests()).

I'm thinking of going further than that. I've notice another bug, in
case of error in xen_block_do_aio(), xen_block_finish_request() is
called without ever calling send_response() or release_request(). I
think that mean a leak of request.

So, I'm thinking of creating a function that would do finish_request(),
send_response(), release_request(), has I believe those operations needs
to be done together anyway.

I'll rework the patch.
diff mbox series

Patch

diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 288a87a814ad..6cc089fc561f 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -123,15 +123,19 @@  static void xen_block_finish_request(XenBlockRequest *request)
     dataplane->requests_inflight--;
 }
 
-static void xen_block_release_request(XenBlockRequest *request)
+static void xen_block_release_request(XenBlockRequest *request, bool finish)
 {
     XenBlockDataPlane *dataplane = request->dataplane;
 
-    QLIST_REMOVE(request, list);
+    if (!finish) {
+        QLIST_REMOVE(request, list);
+    }
     reset_request(request);
     request->dataplane = dataplane;
     QLIST_INSERT_HEAD(&dataplane->freelist, request, list);
-    dataplane->requests_inflight--;
+    if (!finish) {
+        dataplane->requests_inflight--;
+    }
 }
 
 /*
@@ -316,7 +320,7 @@  static void xen_block_complete_aio(void *opaque, int ret)
             error_report_err(local_err);
         }
     }
-    xen_block_release_request(request);
+    xen_block_release_request(request, true);
 
     if (dataplane->more_work) {
         qemu_bh_schedule(dataplane->bh);
@@ -585,7 +589,7 @@  static bool xen_block_handle_requests(XenBlockDataPlane *dataplane)
                     error_report_err(local_err);
                 }
             }
-            xen_block_release_request(request);
+            xen_block_release_request(request, false);
             continue;
         }