diff mbox series

blockdev-backup: don't check aio_context too early

Message ID 20190506203344.30781-1-jsnow@redhat.com (mailing list archive)
State New, archived
Headers show
Series blockdev-backup: don't check aio_context too early | expand

Commit Message

John Snow May 6, 2019, 8:33 p.m. UTC
in blockdev_backup_prepare, we check to make sure that the target is
associated with a compatible aio context. However, do_blockdev_backup is
called later and has some logic to move the target to a compatible
aio_context. The transaction version will fail certain commands
needlessly early as a result.

Allow blockdev_backup_prepare to simply call do_blockdev_backup, which
will ultimately decide if the contexts are compatible or not.

Note: the transaction version has always disallowed this operation since
its initial commit bd8baecd (2014), whereas the version of
qmp_blockdev_backup at the time, from commit c29c1dd312f, tried to
enforce the aio_context switch instead. It's not clear, and I can't see
from the mailing list archives at the time, why the two functions take a
different approach. It wasn't until later in efd7556708b (2016) that the
standalone version tried to determine if it could set the context or
not.

Reported-by: aihua liang <aliang@redhat.com>
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1683498
---
 blockdev.c | 4 ----
 1 file changed, 4 deletions(-)

Comments

Kevin Wolf May 7, 2019, 8:50 a.m. UTC | #1
Am 06.05.2019 um 22:33 hat John Snow geschrieben:
> in blockdev_backup_prepare, we check to make sure that the target is
> associated with a compatible aio context. However, do_blockdev_backup is
> called later and has some logic to move the target to a compatible
> aio_context. The transaction version will fail certain commands
> needlessly early as a result.
> 
> Allow blockdev_backup_prepare to simply call do_blockdev_backup, which
> will ultimately decide if the contexts are compatible or not.
> 
> Note: the transaction version has always disallowed this operation since
> its initial commit bd8baecd (2014), whereas the version of
> qmp_blockdev_backup at the time, from commit c29c1dd312f, tried to
> enforce the aio_context switch instead. It's not clear, and I can't see
> from the mailing list archives at the time, why the two functions take a
> different approach. It wasn't until later in efd7556708b (2016) that the
> standalone version tried to determine if it could set the context or
> not.
> 
> Reported-by: aihua liang <aliang@redhat.com>
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1683498

Signed-off-by is missing, and a testcase, too. :-)

> diff --git a/blockdev.c b/blockdev.c
> index 79fbac8450..a81d88980c 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1872,10 +1872,6 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
>      }
>  
>      aio_context = bdrv_get_aio_context(bs);
> -    if (aio_context != bdrv_get_aio_context(target)) {
> -        error_setg(errp, "Backup between two IO threads is not implemented");
> -        return;
> -    }
>      aio_context_acquire(aio_context);
>      state->bs = bs;

The actual change looks good to me.

Kevin
John Snow May 7, 2019, 2:19 p.m. UTC | #2
On 5/7/19 4:50 AM, Kevin Wolf wrote:
> Am 06.05.2019 um 22:33 hat John Snow geschrieben:
>> in blockdev_backup_prepare, we check to make sure that the target is
>> associated with a compatible aio context. However, do_blockdev_backup is
>> called later and has some logic to move the target to a compatible
>> aio_context. The transaction version will fail certain commands
>> needlessly early as a result.
>>
>> Allow blockdev_backup_prepare to simply call do_blockdev_backup, which
>> will ultimately decide if the contexts are compatible or not.
>>
>> Note: the transaction version has always disallowed this operation since
>> its initial commit bd8baecd (2014), whereas the version of
>> qmp_blockdev_backup at the time, from commit c29c1dd312f, tried to
>> enforce the aio_context switch instead. It's not clear, and I can't see
>> from the mailing list archives at the time, why the two functions take a
>> different approach. It wasn't until later in efd7556708b (2016) that the
>> standalone version tried to determine if it could set the context or
>> not.
>>
>> Reported-by: aihua liang <aliang@redhat.com>
>> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1683498
> 
> Signed-off-by is missing, and a testcase, too. :-)
> 

I am apparently exceedingly good at breaking git-publish, sorry about that.

>> diff --git a/blockdev.c b/blockdev.c
>> index 79fbac8450..a81d88980c 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -1872,10 +1872,6 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
>>      }
>>  
>>      aio_context = bdrv_get_aio_context(bs);
>> -    if (aio_context != bdrv_get_aio_context(target)) {
>> -        error_setg(errp, "Backup between two IO threads is not implemented");
>> -        return;
>> -    }
>>      aio_context_acquire(aio_context);
>>      state->bs = bs;
> 
> The actual change looks good to me.
> 
> Kevin
> 

As far as the test case goes, I'll see if I can reproduce it without a
real guest. If you test it with no guest and empty drives with the VM
paused, the error never gets thrown. Maybe it's enough to start an empty
guest and do some junk writes to make the bitmap non-empty.

Thanks,

--js
John Snow May 8, 2019, 10:55 p.m. UTC | #3
On 5/7/19 4:50 AM, Kevin Wolf wrote:
> Am 06.05.2019 um 22:33 hat John Snow geschrieben:
>> in blockdev_backup_prepare, we check to make sure that the target is
>> associated with a compatible aio context. However, do_blockdev_backup is
>> called later and has some logic to move the target to a compatible
>> aio_context. The transaction version will fail certain commands
>> needlessly early as a result.
>>
>> Allow blockdev_backup_prepare to simply call do_blockdev_backup, which
>> will ultimately decide if the contexts are compatible or not.
>>
>> Note: the transaction version has always disallowed this operation since
>> its initial commit bd8baecd (2014), whereas the version of
>> qmp_blockdev_backup at the time, from commit c29c1dd312f, tried to
>> enforce the aio_context switch instead. It's not clear, and I can't see
>> from the mailing list archives at the time, why the two functions take a
>> different approach. It wasn't until later in efd7556708b (2016) that the
>> standalone version tried to determine if it could set the context or
>> not.
>>
>> Reported-by: aihua liang <aliang@redhat.com>
>> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1683498
> 
> Signed-off-by is missing, and a testcase, too. :-)
> 
>> diff --git a/blockdev.c b/blockdev.c
>> index 79fbac8450..a81d88980c 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -1872,10 +1872,6 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
>>      }
>>  
>>      aio_context = bdrv_get_aio_context(bs);
>> -    if (aio_context != bdrv_get_aio_context(target)) {
>> -        error_setg(errp, "Backup between two IO threads is not implemented");
>> -        return;
>> -    }
>>      aio_context_acquire(aio_context);
>>      state->bs = bs;
> 
> The actual change looks good to me.
> 
> Kevin
> 

(See the Red Hat BZ for details on the test I am more or less trying to
replicate in iotests -- but the jist of it is using an iothread on one
device and not specifying one for the other, then backing up both to two
blockdev-add created nodes not attached to any device.)

Is there some trick to getting this to fail with accel=qtest? I'm
noticing that if the VM is paused or if I use accel=qtest, the iothread
contexts for the drives appear to go ... unresolved? and the test passes.

I've only had luck (so far) reproducing this with accel=kvm on a running
VM (the drives can be empty) and I don't actually know why that is just
yet -- I guess these aio_context objects get resolved at runtime?

Anyway, this seems to be a little tricky so far, maybe you have some
advice for me?

--js

(Also note: doing a full backup and an incremental backup for two
perfectly empty 64GB qcow2 drives takes over 6 seconds in total. It
probably shouldn't, right? There's something about the initial backup
that appears to take a pretty long time.)
Kevin Wolf May 9, 2019, 8 a.m. UTC | #4
Am 09.05.2019 um 00:55 hat John Snow geschrieben:
> 
> On 5/7/19 4:50 AM, Kevin Wolf wrote:
> > Am 06.05.2019 um 22:33 hat John Snow geschrieben:
> >> in blockdev_backup_prepare, we check to make sure that the target is
> >> associated with a compatible aio context. However, do_blockdev_backup is
> >> called later and has some logic to move the target to a compatible
> >> aio_context. The transaction version will fail certain commands
> >> needlessly early as a result.
> >>
> >> Allow blockdev_backup_prepare to simply call do_blockdev_backup, which
> >> will ultimately decide if the contexts are compatible or not.
> >>
> >> Note: the transaction version has always disallowed this operation since
> >> its initial commit bd8baecd (2014), whereas the version of
> >> qmp_blockdev_backup at the time, from commit c29c1dd312f, tried to
> >> enforce the aio_context switch instead. It's not clear, and I can't see
> >> from the mailing list archives at the time, why the two functions take a
> >> different approach. It wasn't until later in efd7556708b (2016) that the
> >> standalone version tried to determine if it could set the context or
> >> not.
> >>
> >> Reported-by: aihua liang <aliang@redhat.com>
> >> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1683498
> > 
> > Signed-off-by is missing, and a testcase, too. :-)
> > 
> >> diff --git a/blockdev.c b/blockdev.c
> >> index 79fbac8450..a81d88980c 100644
> >> --- a/blockdev.c
> >> +++ b/blockdev.c
> >> @@ -1872,10 +1872,6 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
> >>      }
> >>  
> >>      aio_context = bdrv_get_aio_context(bs);
> >> -    if (aio_context != bdrv_get_aio_context(target)) {
> >> -        error_setg(errp, "Backup between two IO threads is not implemented");
> >> -        return;
> >> -    }
> >>      aio_context_acquire(aio_context);
> >>      state->bs = bs;
> > 
> > The actual change looks good to me.
> > 
> > Kevin
> > 
> 
> (See the Red Hat BZ for details on the test I am more or less trying to
> replicate in iotests -- but the jist of it is using an iothread on one
> device and not specifying one for the other, then backing up both to two
> blockdev-add created nodes not attached to any device.)
> 
> Is there some trick to getting this to fail with accel=qtest? I'm
> noticing that if the VM is paused or if I use accel=qtest, the iothread
> contexts for the drives appear to go ... unresolved? and the test passes.
> 
> I've only had luck (so far) reproducing this with accel=kvm on a running
> VM (the drives can be empty) and I don't actually know why that is just
> yet -- I guess these aio_context objects get resolved at runtime?
> 
> Anyway, this seems to be a little tricky so far, maybe you have some
> advice for me?

Are you using virtio-blk? I think it only actually enables dataplane
(and moves its node to an iothread) during initialisation of the guest
driver. This makes it quite annoying for testing, I've had the same
problem before.

However, virtio-scsi does what you would expect and puts the node into
the right AioContext directly on creation.

So maybe just switching to virtio-scsi solves you problem?

> (Also note: doing a full backup and an incremental backup for two
> perfectly empty 64GB qcow2 drives takes over 6 seconds in total. It
> probably shouldn't, right? There's something about the initial backup
> that appears to take a pretty long time.)

Sounds like Someone (TM) should have a closer look, yes.

Kevin
John Snow May 9, 2019, 6:31 p.m. UTC | #5
On 5/9/19 4:00 AM, Kevin Wolf wrote:
> Am 09.05.2019 um 00:55 hat John Snow geschrieben:
>>
>> On 5/7/19 4:50 AM, Kevin Wolf wrote:
>>> Am 06.05.2019 um 22:33 hat John Snow geschrieben:
>>>> in blockdev_backup_prepare, we check to make sure that the target is
>>>> associated with a compatible aio context. However, do_blockdev_backup is
>>>> called later and has some logic to move the target to a compatible
>>>> aio_context. The transaction version will fail certain commands
>>>> needlessly early as a result.
>>>>
>>>> Allow blockdev_backup_prepare to simply call do_blockdev_backup, which
>>>> will ultimately decide if the contexts are compatible or not.
>>>>
>>>> Note: the transaction version has always disallowed this operation since
>>>> its initial commit bd8baecd (2014), whereas the version of
>>>> qmp_blockdev_backup at the time, from commit c29c1dd312f, tried to
>>>> enforce the aio_context switch instead. It's not clear, and I can't see
>>>> from the mailing list archives at the time, why the two functions take a
>>>> different approach. It wasn't until later in efd7556708b (2016) that the
>>>> standalone version tried to determine if it could set the context or
>>>> not.
>>>>
>>>> Reported-by: aihua liang <aliang@redhat.com>
>>>> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1683498
>>>
>>> Signed-off-by is missing, and a testcase, too. :-)
>>>
>>>> diff --git a/blockdev.c b/blockdev.c
>>>> index 79fbac8450..a81d88980c 100644
>>>> --- a/blockdev.c
>>>> +++ b/blockdev.c
>>>> @@ -1872,10 +1872,6 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
>>>>      }
>>>>  
>>>>      aio_context = bdrv_get_aio_context(bs);
>>>> -    if (aio_context != bdrv_get_aio_context(target)) {
>>>> -        error_setg(errp, "Backup between two IO threads is not implemented");
>>>> -        return;
>>>> -    }
>>>>      aio_context_acquire(aio_context);
>>>>      state->bs = bs;
>>>
>>> The actual change looks good to me.
>>>
>>> Kevin
>>>
>>
>> (See the Red Hat BZ for details on the test I am more or less trying to
>> replicate in iotests -- but the jist of it is using an iothread on one
>> device and not specifying one for the other, then backing up both to two
>> blockdev-add created nodes not attached to any device.)
>>
>> Is there some trick to getting this to fail with accel=qtest? I'm
>> noticing that if the VM is paused or if I use accel=qtest, the iothread
>> contexts for the drives appear to go ... unresolved? and the test passes.
>>
>> I've only had luck (so far) reproducing this with accel=kvm on a running
>> VM (the drives can be empty) and I don't actually know why that is just
>> yet -- I guess these aio_context objects get resolved at runtime?
>>
>> Anyway, this seems to be a little tricky so far, maybe you have some
>> advice for me?
> 
> Are you using virtio-blk? I think it only actually enables dataplane
> (and moves its node to an iothread) during initialisation of the guest
> driver. This makes it quite annoying for testing, I've had the same
> problem before.
> 
> However, virtio-scsi does what you would expect and puts the node into
> the right AioContext directly on creation.
> 
> So maybe just switching to virtio-scsi solves you problem?
> 

Aha. I will give that a try, thanks! I was indeed using virtio-blk.

>> (Also note: doing a full backup and an incremental backup for two
>> perfectly empty 64GB qcow2 drives takes over 6 seconds in total. It
>> probably shouldn't, right? There's something about the initial backup
>> that appears to take a pretty long time.)
> 
> Sounds like Someone (TM) should have a closer look, yes.
> 
> Kevin
> 

OK, I'll dig into that after I finish this piece.

Thank you,
--js
diff mbox series

Patch

diff --git a/blockdev.c b/blockdev.c
index 79fbac8450..a81d88980c 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1872,10 +1872,6 @@  static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
     }
 
     aio_context = bdrv_get_aio_context(bs);
-    if (aio_context != bdrv_get_aio_context(target)) {
-        error_setg(errp, "Backup between two IO threads is not implemented");
-        return;
-    }
     aio_context_acquire(aio_context);
     state->bs = bs;