[for-2.9] blockjob: avoid recursive AioContext locking

Message ID 1490118490-5597-1-git-send-email-pbonzini@redhat.com (mailing list archive)
State New, archived

Commit Message

Paolo Bonzini March 21, 2017, 5:48 p.m. UTC
Streaming or any other block job hangs when performed on a block device
that has a non-default iothread.  This happens because the AioContext
is acquired twice by block_job_defer_to_main_loop_bh and then released
only once by BDRV_POLL_WHILE.  (Insert rants on recursive mutexes, which

unfortunately are a temporary but necessary evil for iothreads at the
moment).

Luckily, the reason for the double acquisition is simple; the function
acquires the AioContext for both the job iothread and the BDS iothread,
in case the BDS iothread was changed while the job was running.  It
is therefore enough to skip the second acquisition when the two
AioContexts are one and the same.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 blockjob.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
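
The hang described in the commit message can be reproduced outside QEMU with a plain POSIX recursive mutex. The sketch below is not QEMU code: `lock_free_after` and `other_thread_can_lock` are hypothetical helper names, and the probe thread only stands in for an iothread that needs the lock. It takes the lock a given number of times, drops it a given number of times, and reports whether another thread can then take it.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* State shared with the probe thread (hypothetical helper type). */
typedef struct Probe {
    pthread_mutex_t *lock;
    bool acquired;
} Probe;

/* Runs in a second thread: try to take the lock without blocking. */
static void *probe_thread(void *opaque)
{
    Probe *p = opaque;

    if (pthread_mutex_trylock(p->lock) == 0) {
        p->acquired = true;
        pthread_mutex_unlock(p->lock);
    }
    return NULL;
}

/* Returns true if a different thread is able to take the lock right now. */
bool other_thread_can_lock(pthread_mutex_t *lock)
{
    Probe p = { .lock = lock, .acquired = false };
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, probe_thread, &p);

    assert(rc == 0);
    pthread_join(tid, NULL);
    return p.acquired;
}

/*
 * Lock a recursive mutex `acquires` times, unlock it `releases` times
 * (releases <= acquires), and report whether another thread could then
 * take it.
 */
bool lock_free_after(int acquires, int releases)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t lock;
    bool free_now;
    int i;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&lock, &attr);
    pthread_mutexattr_destroy(&attr);

    for (i = 0; i < acquires; i++) {
        pthread_mutex_lock(&lock);
    }
    for (i = 0; i < releases; i++) {
        pthread_mutex_unlock(&lock);
    }

    free_now = other_thread_can_lock(&lock);

    /* Drop any remaining holds so the mutex can be destroyed safely. */
    for (i = releases; i < acquires; i++) {
        pthread_mutex_unlock(&lock);
    }
    pthread_mutex_destroy(&lock);
    return free_now;
}
```

With two acquisitions and one release the other thread cannot get the lock, which corresponds to the situation where the code polling for completion releases the AioContext only once and then waits for a thread that can never acquire it.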

Comments

Eric Blake March 21, 2017, 7:15 p.m. UTC | #1
On 03/21/2017 12:48 PM, Paolo Bonzini wrote:
> Streaming or any other block job hangs when performed on a block device
> that has a non-default iothread.  This happens because the AioContext
> is acquired twice by block_job_defer_to_main_loop_bh and then released
> only once by BDRV_POLL_WHILE.  (Insert rants on recursive mutexes, which
> 
> unfortunately are a temporary but necessary evil for iothreads at the

Why the blank line?

> moment).
> 
> Luckily, the reason for the double acquisition is simple; the function
> acquires the AioContext for both the job iothread and the BDS iothread,
> in case the BDS iothread was changed while the job was running.  It
> is therefore enough to skip the second acquisition when the two
> AioContexts are one and the same.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  blockjob.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 

Makes sense from the description.
Reviewed-by: Eric Blake <eblake@redhat.com>

Jeff Cody March 22, 2017, 12:05 p.m. UTC | #2
Reviewed-by: Jeff Cody <jcody@redhat.com>

Jeff Cody March 22, 2017, 12:15 p.m. UTC | #3
Deleted the blank line in the commit message, and:

Thanks,

Applied to my block branch:

git://github.com/codyprime/qemu-kvm-jtc.git block

-Jeff

John Snow March 22, 2017, 3:32 p.m. UTC | #4
Reviewed-by: John Snow <jsnow@redhat.com>

Patch

diff --git a/blockjob.c b/blockjob.c
index 69126af..2159df7 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -755,12 +755,16 @@  static void block_job_defer_to_main_loop_bh(void *opaque)
 
     /* Fetch BDS AioContext again, in case it has changed */
     aio_context = blk_get_aio_context(data->job->blk);
-    aio_context_acquire(aio_context);
+    if (aio_context != data->aio_context) {
+        aio_context_acquire(aio_context);
+    }
 
     data->job->deferred_to_main_loop = false;
     data->fn(data->job, data->opaque);
 
-    aio_context_release(aio_context);
+    if (aio_context != data->aio_context) {
+        aio_context_release(aio_context);
+    }
 
     aio_context_release(data->aio_context);
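
The effect of the patch can be checked in isolation with a small model. The sketch below is not QEMU code: `MockAioContext`, `mock_acquire`, `mock_release` and the `run_deferred_*` functions are hypothetical stand-ins that only count how deeply a context's recursive lock is held, and the job context is assumed to have been acquired earlier in the function, as the commit message describes. `depth_seen_by_callback` returns the hold depth that the deferred function observes on the BDS context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an AioContext: tracks only the hold depth. */
typedef struct MockAioContext {
    int depth;
} MockAioContext;

static void mock_acquire(MockAioContext *ctx)
{
    ctx->depth++;
}

static void mock_release(MockAioContext *ctx)
{
    assert(ctx->depth > 0);
    ctx->depth--;
}

/* Deferred function type: called while the contexts are held. */
typedef void MockDeferredFn(MockAioContext *bds_ctx, void *opaque);

/* Pre-patch behaviour: the BDS context is always acquired a second time. */
void run_deferred_unpatched(MockAioContext *job_ctx, MockAioContext *bds_ctx,
                            MockDeferredFn *fn, void *opaque)
{
    mock_acquire(job_ctx);
    mock_acquire(bds_ctx);
    fn(bds_ctx, opaque);
    mock_release(bds_ctx);
    mock_release(job_ctx);
}

/* Post-patch behaviour: skip the second acquire/release if both match. */
void run_deferred_patched(MockAioContext *job_ctx, MockAioContext *bds_ctx,
                          MockDeferredFn *fn, void *opaque)
{
    mock_acquire(job_ctx);
    if (bds_ctx != job_ctx) {
        mock_acquire(bds_ctx);
    }
    fn(bds_ctx, opaque);
    if (bds_ctx != job_ctx) {
        mock_release(bds_ctx);
    }
    mock_release(job_ctx);
}

/* Example deferred function: record the hold depth it runs under. */
void record_depth(MockAioContext *bds_ctx, void *opaque)
{
    *(int *)opaque = bds_ctx->depth;
}

/* Run one variant and return the depth seen inside the deferred function. */
int depth_seen_by_callback(bool patched, bool same_context)
{
    MockAioContext job_ctx = { 0 };
    MockAioContext other_ctx = { 0 };
    MockAioContext *bds_ctx = same_context ? &job_ctx : &other_ctx;
    int seen = -1;

    if (patched) {
        run_deferred_patched(&job_ctx, bds_ctx, record_depth, &seen);
    } else {
        run_deferred_unpatched(&job_ctx, bds_ctx, record_depth, &seen);
    }

    /* Both variants must leave every context fully released. */
    assert(job_ctx.depth == 0 && other_ctx.depth == 0);
    return seen;
}
```

In this model, only the unpatched variant with identical contexts runs the deferred function at depth 2. A single release inside that function would then leave the context still held. All other combinations run at depth 1, so the patch changes behaviour only in the case that hangs.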