[v3] blockjob: Fix hang in block_job_finish_sync

Message ID 1454033989-16996-1-git-send-email-famz@redhat.com (mailing list archive)
State New, archived

Commit Message

Fam Zheng Jan. 29, 2016, 2:19 a.m. UTC
With a mirror job running on a virtio-blk dataplane disk, sending "q" to
HMP causes an infinite loop in block_job_finish_sync.

This is because aio_poll() only processes the AioContext of bs, which
has no more work to do, while the main loop BH that is scheduled to set
the job->completed flag is never processed.

Fix this by adding a flag to the BlockJob structure that tracks which
context to poll for the block job to make progress. It is set to true
when block_job_defer_to_main_loop() is called, and is checked in
block_job_finish_sync to determine which context to poll.

Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockjob.c               | 5 ++++-
 include/block/blockjob.h | 9 +++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)
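
The hang and the fix described in the commit message can be modelled
outside QEMU. The sketch below is only a toy model under stated
assumptions, not QEMU code: SimContext, SimJob and all sim_* functions
are hypothetical stand-ins for AioContext, BlockJob, aio_poll(),
block_job_defer_to_main_loop() and block_job_finish_sync(), and a poll
budget stands in for the infinite loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for BlockJob and AioContext (not the QEMU types). */
typedef struct SimJob SimJob;
typedef void SimBhFn(SimJob *job);

typedef struct SimContext {
    SimBhFn *pending_bh;   /* at most one scheduled bottom half */
    SimJob *pending_job;   /* argument for pending_bh */
} SimContext;

struct SimJob {
    bool completed;
    bool deferred_to_main_loop;
    SimContext *bs_ctx;    /* context of the job's block device */
    SimContext *main_ctx;  /* main loop context */
};

/* Stand-in for aio_poll(): run the pending bottom half of ctx, if any. */
static void sim_poll(SimContext *ctx)
{
    SimBhFn *fn = ctx->pending_bh;
    SimJob *job = ctx->pending_job;

    if (!fn) {
        return;  /* nothing to do in this context */
    }
    ctx->pending_bh = NULL;
    ctx->pending_job = NULL;
    fn(job);
}

/* Deferred work: mark the job as completed from the main loop. */
static void sim_complete_bh(SimJob *job)
{
    job->deferred_to_main_loop = false;
    job->completed = true;
}

/* Stand-in for block_job_defer_to_main_loop(): schedule fn and set the flag. */
static void sim_defer_to_main_loop(SimJob *job, SimBhFn *fn)
{
    assert(job->main_ctx->pending_bh == NULL);
    job->main_ctx->pending_bh = fn;
    job->main_ctx->pending_job = job;
    job->deferred_to_main_loop = true;
}

/*
 * Stand-in for the loop in block_job_finish_sync(). Returns the number of
 * polls needed, or -1 if the job did not complete within max_polls (the
 * model's equivalent of the hang).
 */
static int sim_finish_sync(SimJob *job, bool use_flag, int max_polls)
{
    int polls = 0;

    while (!job->completed) {
        if (polls == max_polls) {
            return -1;
        }
        sim_poll(use_flag && job->deferred_to_main_loop ? job->main_ctx
                                                        : job->bs_ctx);
        polls++;
    }
    return polls;
}

/* Build a job whose completion is already deferred, then wait for it. */
static int sim_run(bool use_flag)
{
    SimContext bs_ctx = { NULL, NULL };
    SimContext main_ctx = { NULL, NULL };
    SimJob job = { false, false, &bs_ctx, &main_ctx };

    sim_defer_to_main_loop(&job, sim_complete_bh);
    return sim_finish_sync(&job, use_flag, 1000);
}
```

In this model, sim_run(false) exhausts its poll budget because only the
idle bs context is polled, while sim_run(true) completes once the main
loop context is polled.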

Comments

Stefan Hajnoczi Jan. 29, 2016, 11:31 a.m. UTC | #1
On Fri, Jan 29, 2016 at 10:19:49AM +0800, Fam Zheng wrote:
> @@ -402,6 +407,10 @@ typedef void BlockJobDeferToMainLoopFn(BlockJob *job, void *opaque);
>   * AioContext acquired.  Block jobs must call bdrv_unref(), bdrv_close(), and
>   * anything that uses bdrv_drain_all() in the main loop.
>   *
> + * The job->deferred_to_main_loop flag will be set. Caller must clear it once
> + * the deferred work is done and the block job coroutine continues, unless it's
> + * completing immediately.
> + *

It's not necessary to expose job->deferred_to_main_loop to the user.
Just clear it:

static void block_job_defer_to_main_loop_bh(void *opaque)
{
    BlockJobDeferToMainLoopData *data = opaque;
    AioContext *aio_context;

    qemu_bh_delete(data->bh);

    /* Prevent race with block_job_defer_to_main_loop() */
    aio_context_acquire(data->aio_context);

    /* Fetch BDS AioContext again, in case it has changed */
    aio_context = bdrv_get_aio_context(data->job->bs);
    aio_context_acquire(aio_context);

    data->fn(data->job, data->opaque);
    data->job->deferred_to_main_loop = false;  /* <----- HERE */

    aio_context_release(aio_context);

    aio_context_release(data->aio_context);

    g_free(data);
}
Fam Zheng Feb. 1, 2016, 2:49 a.m. UTC | #2
On Fri, 01/29 11:31, Stefan Hajnoczi wrote:
> On Fri, Jan 29, 2016 at 10:19:49AM +0800, Fam Zheng wrote:
> > @@ -402,6 +407,10 @@ typedef void BlockJobDeferToMainLoopFn(BlockJob *job, void *opaque);
> >   * AioContext acquired.  Block jobs must call bdrv_unref(), bdrv_close(), and
> >   * anything that uses bdrv_drain_all() in the main loop.
> >   *
> > + * The job->deferred_to_main_loop flag will be set. Caller must clear it once
> > + * the deferred work is done and the block job coroutine continues, unless it's
> > + * completing immediately.
> > + *
> 
> It's not necessary to expose job->deferred_to_main_loop to the user.
> Just clear it:
> 
> static void block_job_defer_to_main_loop_bh(void *opaque)
> {
>     BlockJobDeferToMainLoopData *data = opaque;
>     AioContext *aio_context;
> 
>     qemu_bh_delete(data->bh);
> 
>     /* Prevent race with block_job_defer_to_main_loop() */
>     aio_context_acquire(data->aio_context);
> 
>     /* Fetch BDS AioContext again, in case it has changed */
>     aio_context = bdrv_get_aio_context(data->job->bs);
>     aio_context_acquire(aio_context);
> 
>     data->fn(data->job, data->opaque);
>     job->deferred_to_main_loop = false;  /* <----- HERE */

Maybe move it one line above, in case data->fn() calls
block_job_defer_to_main_loop() again?

Fam

> 
>     aio_context_release(aio_context);
> 
>     aio_context_release(data->aio_context);
> 
>     g_free(data);
> }
Stefan Hajnoczi Feb. 1, 2016, 11:36 a.m. UTC | #3
On Mon, Feb 01, 2016 at 10:49:00AM +0800, Fam Zheng wrote:
> On Fri, 01/29 11:31, Stefan Hajnoczi wrote:
> > On Fri, Jan 29, 2016 at 10:19:49AM +0800, Fam Zheng wrote:
> > > @@ -402,6 +407,10 @@ typedef void BlockJobDeferToMainLoopFn(BlockJob *job, void *opaque);
> > >   * AioContext acquired.  Block jobs must call bdrv_unref(), bdrv_close(), and
> > >   * anything that uses bdrv_drain_all() in the main loop.
> > >   *
> > > + * The job->deferred_to_main_loop flag will be set. Caller must clear it once
> > > + * the deferred work is done and the block job coroutine continues, unless it's
> > > + * completing immediately.
> > > + *
> > 
> > It's not necessary to expose job->deferred_to_main_loop to the user.
> > Just clear it:
> > 
> > static void block_job_defer_to_main_loop_bh(void *opaque)
> > {
> >     BlockJobDeferToMainLoopData *data = opaque;
> >     AioContext *aio_context;
> > 
> >     qemu_bh_delete(data->bh);
> > 
> >     /* Prevent race with block_job_defer_to_main_loop() */
> >     aio_context_acquire(data->aio_context);
> > 
> >     /* Fetch BDS AioContext again, in case it has changed */
> >     aio_context = bdrv_get_aio_context(data->job->bs);
> >     aio_context_acquire(aio_context);
> > 
> >     data->fn(data->job, data->opaque);
> >     job->deferred_to_main_loop = false;  /* <----- HERE */
> 
> Maybe move one line above in case data->fn() does another
> block_job_defer_to_main_loop()?

Yes, good point.  Thanks for spotting the bug.

It's safe to clear the boolean as soon as we acquire aio_context.

Stefan
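
The ordering issue discussed above can be illustrated with a small
stand-alone model. This is a sketch under assumptions, not the QEMU
implementation: ToyJob and the toy_* functions are hypothetical, the
AioContext locking is left out, and toy_first_step() plays the role of
a data->fn() that defers again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BlockJob (not the QEMU type). */
typedef struct ToyJob ToyJob;
typedef void ToyDeferredFn(ToyJob *job);

struct ToyJob {
    bool deferred_to_main_loop;
    ToyDeferredFn *scheduled;  /* work queued for the main loop, if any */
};

/* Stand-in for block_job_defer_to_main_loop(): queue fn and set the flag. */
static void toy_defer(ToyJob *job, ToyDeferredFn *fn)
{
    job->scheduled = fn;
    job->deferred_to_main_loop = true;
}

/* Last piece of deferred work; does nothing in this model. */
static void toy_final_step(ToyJob *job)
{
    (void)job;
}

/* Deferred work that defers once more, as raised in the review. */
static void toy_first_step(ToyJob *job)
{
    toy_defer(job, toy_final_step);
}

/*
 * Stand-in for block_job_defer_to_main_loop_bh(). clear_first selects
 * whether the flag is cleared before or after the deferred function runs.
 */
static void toy_defer_bh(ToyJob *job, bool clear_first)
{
    ToyDeferredFn *fn = job->scheduled;

    assert(fn != NULL);
    job->scheduled = NULL;

    if (clear_first) {
        job->deferred_to_main_loop = false;
    }
    fn(job);
    if (!clear_first) {
        /* Overwrites the flag that a nested toy_defer() just set. */
        job->deferred_to_main_loop = false;
    }
}

/* True if the flag agrees with whether main loop work is still queued. */
static bool toy_flag_consistent(bool clear_first)
{
    ToyJob job = { false, NULL };

    toy_defer(&job, toy_first_step);
    toy_defer_bh(&job, clear_first);
    return job.deferred_to_main_loop == (job.scheduled != NULL);
}
```

Clearing the flag before calling the deferred function keeps it in sync
with the queued work; clearing it afterwards leaves work queued for the
main loop while the flag claims otherwise.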
Patch

diff --git a/blockjob.c b/blockjob.c
index 80adb9d..25e1581 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -304,7 +304,9 @@  static int block_job_finish_sync(BlockJob *job,
         return -EBUSY;
     }
     while (!job->completed) {
-        aio_poll(bdrv_get_aio_context(bs), true);
+        aio_poll(job->deferred_to_main_loop ? qemu_get_aio_context() :
+                                              bdrv_get_aio_context(bs),
+                 true);
     }
     ret = (job->cancelled && job->ret == 0) ? -ECANCELED : job->ret;
     block_job_unref(job);
@@ -497,6 +499,7 @@  void block_job_defer_to_main_loop(BlockJob *job,
     data->aio_context = bdrv_get_aio_context(job->bs);
     data->fn = fn;
     data->opaque = opaque;
+    job->deferred_to_main_loop = true;
 
     qemu_bh_schedule(data->bh);
 }
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index d84ccd8..550de26 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -130,6 +130,11 @@  struct BlockJob {
      */
     bool ready;
 
+    /**
+     * Set to true when the job has deferred work to the main loop.
+     */
+    bool deferred_to_main_loop;
+
     /** Status that is published by the query-block-jobs QMP API */
     BlockDeviceIoStatus iostatus;
 
@@ -402,6 +407,10 @@  typedef void BlockJobDeferToMainLoopFn(BlockJob *job, void *opaque);
  * AioContext acquired.  Block jobs must call bdrv_unref(), bdrv_close(), and
  * anything that uses bdrv_drain_all() in the main loop.
  *
+ * The job->deferred_to_main_loop flag will be set. Caller must clear it once
+ * the deferred work is done and the block job coroutine continues, unless it's
+ * completing immediately.
+ *
  * The @job AioContext is held while @fn executes.
  */
 void block_job_defer_to_main_loop(BlockJob *job,