Message ID | 1453307106-28330-9-git-send-email-kwolf@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi, this commit breaks incoming migration case: qemu-system-ppc64 XXX -incoming tcp:0:4444 qemu-system-ppc64: .../qemu/block/io.c:1304: bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed. Without "-incoming", the same command line boots fine. Are you aware of the problem? Laurent On 20/01/2016 17:24, Kevin Wolf wrote: > As long as BDRV_O_INCOMING is set, the image file is only opened so we > have a file descriptor for it. We're definitely not supposed to modify > the image, it's still owned by the migration source. > > Signed-off-by: Kevin Wolf <kwolf@redhat.com> > Reviewed-by: Eric Blake <eblake@redhat.com> > --- > block/io.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/block/io.c b/block/io.c > index 707c04b..2372994 100644 > --- a/block/io.c > +++ b/block/io.c > @@ -1301,6 +1301,7 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs, > if (bs->read_only) { > return -EPERM; > } > + assert(!(bs->open_flags & BDRV_O_INCOMING)); > > ret = bdrv_check_byte_request(bs, offset, bytes); > if (ret < 0) { > @@ -2462,6 +2463,7 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num, > } else if (bs->read_only) { > return -EPERM; > } > + assert(!(bs->open_flags & BDRV_O_INCOMING)); > > /* Do nothing if disabled. */ > if (!(bs->open_flags & BDRV_O_UNMAP)) { >
Am 18.02.2016 um 17:03 hat Laurent Vivier geschrieben: > Hi, > > this commit breaks incoming migration case: > > qemu-system-ppc64 XXX -incoming tcp:0:4444 > qemu-system-ppc64: .../qemu/block/io.c:1304: bdrv_co_do_pwritev: > Assertion `!(bs->open_flags & 0x0800)' failed. > > Without "-incoming", the same command line boots fine. > > Are you aware of the problem? No, and with no other command line options it works for me: $ ppc64-softmmu/qemu-system-ppc64 -incoming tcp::4444 I have also tested somewhat more realistic cases on x86 (including performing actual migrations) before committing the patch, so it's not generally broken. Can you please give the full command line and the stack backtrace? I think you're seeing a bug where some PPC hardware tries to read (or access anyway) the image before the migration has completed. This is for obvious reasons wrong (the contents may still change until the VM is handed over). Kevin
On 19/02/2016 10:17, Kevin Wolf wrote: > Am 18.02.2016 um 17:03 hat Laurent Vivier geschrieben: >> Hi, >> >> this commit breaks incoming migration case: >> >> qemu-system-ppc64 XXX -incoming tcp:0:4444 >> qemu-system-ppc64: .../qemu/block/io.c:1304: bdrv_co_do_pwritev: >> Assertion `!(bs->open_flags & 0x0800)' failed. >> >> Without "-incoming", the same command line boots fine. >> >> Are you aware of the problem? > > No, and with no other command line options it works for me: > > $ ppc64-softmmu/qemu-system-ppc64 -incoming tcp::4444 > > I have also tested somewhat more realistic cases on x86 (including > performing actual migrations) before committing the patch, so it's not > generally broken. > > Can you please give the full command line and the stack backtrace? I > think you're seeing a bug where some PPC hardware tries to read (or > access anyway) the image before the migration has completed. This is for > obvious reasons wrong (the contents may still change until the VM is > handed over). It happens before the migration has started, as soon as I start qemu with "-incoming" parameter. The simplified command line is: qemu-system-ppc64 -drive file=rhel7.2-le.qcow2,format=qcow2,snapshot=on -incoming tcp:0:4444 The problem seems coming from "snapshot=on". Laurent
Am 19.02.2016 um 11:37 hat Laurent Vivier geschrieben: > > > On 19/02/2016 10:17, Kevin Wolf wrote: > > Am 18.02.2016 um 17:03 hat Laurent Vivier geschrieben: > >> Hi, > >> > >> this commit breaks incoming migration case: > >> > >> qemu-system-ppc64 XXX -incoming tcp:0:4444 > >> qemu-system-ppc64: .../qemu/block/io.c:1304: bdrv_co_do_pwritev: > >> Assertion `!(bs->open_flags & 0x0800)' failed. > >> > >> Without "-incoming", the same command line boots fine. > >> > >> Are you aware of the problem? > > > > No, and with no other command line options it works for me: > > > > $ ppc64-softmmu/qemu-system-ppc64 -incoming tcp::4444 > > > > I have also tested somewhat more realistic cases on x86 (including > > performing actual migrations) before committing the patch, so it's not > > generally broken. > > > > Can you please give the full command line and the stack backtrace? I > > think you're seeing a bug where some PPC hardware tries to read (or > > access anyway) the image before the migration has completed. This is for > > obvious reasons wrong (the contents may still change until the VM is > > handed over). > > It happens before the migration has started, as soon as I start qemu > with "-incoming" parameter. > > The simplified command line is: > > qemu-system-ppc64 > -drive file=rhel7.2-le.qcow2,format=qcow2,snapshot=on > -incoming tcp:0:4444 > > The problem seems coming from "snapshot=on". Aha. The combination of migration with snapshot=on is obvious nonsense (because on the destination the disk state would be reverted while the guest OS thinks the disk is unchanged), so we should probably take a migration blocker when an image with snapshot=on is opened. Kevin
On 19/02/2016 14:47, Kevin Wolf wrote: > > The problem seems coming from "snapshot=on". > Aha. The combination of migration with snapshot=on is obvious nonsense > (because on the destination the disk state would be reverted while the > guest OS thinks the disk is unchanged), Not really, on the destination the disk state will not be _further updated_, but no reversion happens. Migration with "snapshot=on" makes sense if you are resuming a VM from a saved checkpoint ("-incoming exec"). You can use it to debug a migration load, for example, because you want the resumed guest to always start with the same disk contents. It's somewhat weird, but it definitely has its uses. A blocker would be a pain in the ass... Paolo
Am 19.02.2016 um 15:11 hat Paolo Bonzini geschrieben: > On 19/02/2016 14:47, Kevin Wolf wrote: > > > The problem seems coming from "snapshot=on". > > Aha. The combination of migration with snapshot=on is obvious nonsense > > (because on the destination the disk state would be reverted while the > > guest OS thinks the disk is unchanged), > > Not really, on the destination the disk state will not be _further > updated_, but no reversion happens. Right, I started writing a patch when I noticed that that wasn't quite right and -snapshot on the destination is somewhat unusual, but not in all cases wrong. > Migration with "snapshot=on" makes sense if you are resuming a VM from a > saved checkpoint ("-incoming exec"). You can use it to debug a > migration load, for example, because you want the resumed guest to > always start with the same disk contents. It's somewhat weird, but it > definitely has its uses. A blocker would be a pain in the ass... My other thinko was that a blocker would help. In reality, blockers are on the source host. They would make a little more sense than forbidding -snapshot with -incoming, but we don't know how you migrate storage. If you don't use shared storage but mirroring, then it could be okay again (but probably even weirder than the other case). Kevin
diff --git a/block/io.c b/block/io.c index 707c04b..2372994 100644 --- a/block/io.c +++ b/block/io.c @@ -1301,6 +1301,7 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs, if (bs->read_only) { return -EPERM; } + assert(!(bs->open_flags & BDRV_O_INCOMING)); ret = bdrv_check_byte_request(bs, offset, bytes); if (ret < 0) { @@ -2462,6 +2463,7 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num, } else if (bs->read_only) { return -EPERM; } + assert(!(bs->open_flags & BDRV_O_INCOMING)); /* Do nothing if disabled. */ if (!(bs->open_flags & BDRV_O_UNMAP)) {