| Message ID | 50584D84.2080802@redhat.com (mailing list archive) |
|---|---|
| State | New, archived |
On 09/18/12 12:31, Kevin Wolf wrote:
> On 18.09.2012 12:28, Peter Lieven wrote:
>> On 09/17/12 22:12, Peter Lieven wrote:
>>> On 09/17/12 10:41, Kevin Wolf wrote:
>>>> On 16.09.2012 12:13, Peter Lieven wrote:
>>>>> Hi,
>>>>>
>>>>> when trying to block migrate a VM from one node to another, the source
>>>>> VM crashed with the following assertion:
>>>>> block.c:3829: bdrv_set_in_use: Assertion `bs->in_use != in_use' failed.
>>>>>
>>>>> Is this something already addressed/known?
>>>> Not that I'm aware of, at least.
>>>>
>>>> Block migration doesn't seem to check whether the device is already in
>>>> use, maybe this is the problem. Not sure why it would be in use, though,
>>>> and in my quick test it didn't crash.
>>>>
>>>> So we need some more information: What's your command line, did you do
>>>> anything specific in the monitor with block devices, what does the
>>>> stacktrace look like, etc.?
>>> Kevin, it seems that I can very easily force a crash if I cancel a
>>> running block migration.
>> If I understand correctly what happens, there are aio callbacks coming in
>> after blk_mig_cleanup() has been called.
>>
>> What is the proper way to detect this in blk_mig_read_cb()?
> You could try this, it doesn't detect the situation in
> blk_mig_read_cb(), but ensures that all callbacks happen before we do
> the actual cleanup (completely untested):

After testing it for half an hour I can say it seems to fix the problem:
no segfaults and also no other assertions.

While searching I have seen that the queues blk_list and bmds_list are
initialized at qemu startup. Wouldn't it be better to initialize them in
init_blk_migration(), or at least check that they are really empty? I have
also seen that prev_time_offset is not initialized.

Thank you,
Peter

Something like this:

--- qemu-kvm-1.2.0/block-migration.c.orig	2012-09-17 21:14:44.458429855 +0200
+++ qemu-kvm-1.2.0/block-migration.c	2012-09-17 21:15:40.599736962 +0200
@@ -311,8 +311,12 @@ static void init_blk_migration(QEMUFile
     block_mig_state.prev_progress = -1;
     block_mig_state.bulk_completed = 0;
     block_mig_state.total_time = 0;
+    block_mig_state.prev_time_offset = 0;
     block_mig_state.reads = 0;
 
+    QSIMPLEQ_INIT(&block_mig_state.bmds_list);
+    QSIMPLEQ_INIT(&block_mig_state.blk_list);
+
     bdrv_iterate(init_blk_migration_it, NULL);
 }
 
@@ -760,9 +764,6 @@ SaveVMHandlers savevm_block_handlers = {
 
 void blk_mig_init(void)
 {
-    QSIMPLEQ_INIT(&block_mig_state.bmds_list);
-    QSIMPLEQ_INIT(&block_mig_state.blk_list);
-
     register_savevm_live(NULL, "block", 0, 1, &savevm_block_handlers,
                          &block_mig_state);
 }

> diff --git a/block-migration.c b/block-migration.c
> index 7def8ab..ed93301 100644
> --- a/block-migration.c
> +++ b/block-migration.c
> @@ -519,6 +519,8 @@ static void blk_mig_cleanup(void)
>      BlkMigDevState *bmds;
>      BlkMigBlock *blk;
> 
> +    bdrv_drain_all();
> +
>      set_dirty_tracking(0);
> 
>      while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
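The initialization change Peter proposes above (moving QSIMPLEQ_INIT into init_blk_migration() and zeroing prev_time_offset) can be shown with a small, self-contained sketch. This is not the QEMU code: the struct, the hand-rolled list and the helper names below are simplified stand-ins for block_mig_state, blk_list and QSIMPLEQ_INIT, used only to illustrate why resetting per-migration state at the start of every run, rather than once at program startup, keeps a cancelled or failed run from leaking stale entries into the next one.

```c
/*
 * Minimal sketch, NOT the actual QEMU code: MigState, Blk and the helpers
 * are simplified stand-ins for block_mig_state, blk_list and QSIMPLEQ_INIT.
 */
#include <stdio.h>
#include <stdlib.h>

typedef struct Blk {
    int sector;
    struct Blk *next;
} Blk;

typedef struct {
    Blk *blk_list;          /* pending blocks, analogous to blk_list      */
    long prev_time_offset;  /* analogous to the uninitialized field above */
    int reads;
} MigState;

static MigState mig_state;  /* static lifetime, like block_mig_state */

/* Called at the start of *every* migration run, not just at startup. */
static void init_migration(void)
{
    mig_state.blk_list = NULL;       /* stands in for QSIMPLEQ_INIT */
    mig_state.prev_time_offset = 0;
    mig_state.reads = 0;
}

static void queue_block(int sector)
{
    Blk *blk = malloc(sizeof(*blk));
    blk->sector = sector;
    blk->next = mig_state.blk_list;
    mig_state.blk_list = blk;
    mig_state.reads++;
}

static void cleanup_migration(void)
{
    Blk *blk;
    while ((blk = mig_state.blk_list) != NULL) {
        mig_state.blk_list = blk->next;
        free(blk);
    }
}

int main(void)
{
    /* Two back-to-back runs: because init_migration() re-initializes the
     * state each time, the second run cannot observe leftovers from the
     * first, however the first one ended. */
    for (int run = 0; run < 2; run++) {
        init_migration();
        queue_block(1);
        queue_block(2);
        printf("run %d queued %d blocks\n", run, mig_state.reads);
        cleanup_migration();
    }
    return 0;
}
```

With the reset inside init_migration(), a second run starts from an empty list and zeroed counters regardless of how the previous run ended, which mirrors what moving QSIMPLEQ_INIT into init_blk_migration() is meant to guarantee.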
On 19.09.2012 07:49, Peter Lieven wrote:
> On 09/18/12 12:31, Kevin Wolf wrote:
>> On 18.09.2012 12:28, Peter Lieven wrote:
>>> On 09/17/12 22:12, Peter Lieven wrote:
>>>> On 09/17/12 10:41, Kevin Wolf wrote:
>>>>> On 16.09.2012 12:13, Peter Lieven wrote:
>>>>>> Hi,
>>>>>>
>>>>>> when trying to block migrate a VM from one node to another, the source
>>>>>> VM crashed with the following assertion:
>>>>>> block.c:3829: bdrv_set_in_use: Assertion `bs->in_use != in_use' failed.
>>>>>>
>>>>>> Is this something already addressed/known?
>>>>> Not that I'm aware of, at least.
>>>>>
>>>>> Block migration doesn't seem to check whether the device is already in
>>>>> use, maybe this is the problem. Not sure why it would be in use, though,
>>>>> and in my quick test it didn't crash.
>>>>>
>>>>> So we need some more information: What's your command line, did you do
>>>>> anything specific in the monitor with block devices, what does the
>>>>> stacktrace look like, etc.?
>>>> Kevin, it seems that I can very easily force a crash if I cancel a
>>>> running block migration.
>>> If I understand correctly what happens, there are aio callbacks coming in
>>> after blk_mig_cleanup() has been called.
>>>
>>> What is the proper way to detect this in blk_mig_read_cb()?
>> You could try this, it doesn't detect the situation in
>> blk_mig_read_cb(), but ensures that all callbacks happen before we do
>> the actual cleanup (completely untested):
> After testing it for half an hour I can say it seems to fix the problem:
> no segfaults and also no other assertions.
>
> While searching I have seen that the queues blk_list and bmds_list are
> initialized at qemu startup. Wouldn't it be better to initialize them in
> init_blk_migration(), or at least check that they are really empty? I have
> also seen that prev_time_offset is not initialized.

Probably. If you sent this as a proper patch with a SoB (Signed-off-by), I
wouldn't reject it, but considering that block migration is deprecated anyway,
I won't bother myself as long as there's no real bug.

Kevin
diff --git a/block-migration.c b/block-migration.c
index 7def8ab..ed93301 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -519,6 +519,8 @@ static void blk_mig_cleanup(void)
     BlkMigDevState *bmds;
     BlkMigBlock *blk;
 
+    bdrv_drain_all();
+
     set_dirty_tracking(0);
 
     while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
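A rough, self-contained illustration of why the bdrv_drain_all() call in the hunk above helps: if completion callbacks can still fire after cleanup has freed the tracking lists, they write into freed memory; forcing every outstanding callback to run first removes that window. The types and helpers below (Req, submit(), drain_all(), read_cb()) are invented for this sketch and are not QEMU's actual AIO machinery.

```c
/*
 * Sketch only, NOT QEMU code: drain_all() stands in for bdrv_drain_all(),
 * read_cb() for blk_mig_read_cb(), and the lists for blk_list/bmds_list.
 */
#include <stdio.h>
#include <stdlib.h>

typedef struct Req {
    int id;
    void (*cb)(struct Req *req);
    struct Req *next;
} Req;

static Req *pending;   /* in-flight "aio" requests                         */
static Req *blk_list;  /* completed blocks awaiting transfer, freed below  */

static void read_cb(Req *req)
{
    /* Move the finished request onto blk_list, like blk_mig_read_cb(). */
    req->next = blk_list;
    blk_list = req;
    printf("completed request %d\n", req->id);
}

static void submit(int id)
{
    Req *req = malloc(sizeof(*req));
    req->id = id;
    req->cb = read_cb;
    req->next = pending;
    pending = req;
}

/* Run every outstanding callback; the stand-in for bdrv_drain_all(). */
static void drain_all(void)
{
    while (pending) {
        Req *req = pending;
        pending = req->next;
        req->cb(req);
    }
}

static void cleanup(void)
{
    drain_all();            /* without this, callbacks could still arrive
                               later and touch the list freed just below  */
    while (blk_list) {
        Req *req = blk_list;
        blk_list = req->next;
        free(req);
    }
}

int main(void)
{
    submit(1);
    submit(2);
    cleanup();              /* e.g. the migration is cancelled mid-flight */
    return 0;
}
```

The two-line hunk above enforces the same ordering in QEMU: drain first, then tear down the lists that late callbacks would otherwise still touch.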