diff mbox series

migration: notify runstate immediately before vcpu stops

Message ID 1563261042-15974-1-git-send-email-yan.y.zhao@intel.com (mailing list archive)
State New, archived
Headers show
Series migration: notify runstate immediately before vcpu stops | expand

Commit Message

Yan Zhao July 16, 2019, 7:10 a.m. UTC
for some devices to do live migration, it is needed to do something
immediately before vcpu stops. add a notification here.

Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
 cpus.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Peter Xu July 16, 2019, 7:23 a.m. UTC | #1
On Tue, Jul 16, 2019 at 03:10:42PM +0800, Yan Zhao wrote:
> for some devices to do live migration, it is needed to do something
> immediately before vcpu stops. add a notification here.

Hi, Yan,

Could I ask for a more detailed commit message here?  E.g., what is
"some devices"?  And, what's the problem behind?

Thanks,

> 
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> ---
>  cpus.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/cpus.c b/cpus.c
> index b09b702..d5d4abe 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -1068,6 +1068,7 @@ static int do_vm_stop(RunState state, bool send_stop)
>      int ret = 0;
>  
>      if (runstate_is_running()) {
> +        vm_state_notify(1, state);
>          cpu_disable_ticks();
>          pause_all_vcpus();
>          runstate_set(state);
> -- 
> 2.7.4
> 
>
Yan Zhao July 16, 2019, 7:29 a.m. UTC | #2
On Tue, Jul 16, 2019 at 03:23:16PM +0800, Peter Xu wrote:
> On Tue, Jul 16, 2019 at 03:10:42PM +0800, Yan Zhao wrote:
> > for some devices to do live migration, it is needed to do something
> > immediately before vcpu stops. add a notification here.
> 
> Hi, Yan,
> 
> Could I ask for a more detailed commit message here?  E.g., what is
> "some devices"?  And, what's the problem behind?
>
hi Peter,

Some devices refer to assigned devices, like NICs.
For assigned devices to do live migration, it is sometimes required that
source device is stopped before stopping source vcpus. vcpus can do some
final cleanups (like handling interrupt) in that case.

Thanks
Yan

> Thanks,
> 
> > 
> > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > ---
> >  cpus.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/cpus.c b/cpus.c
> > index b09b702..d5d4abe 100644
> > --- a/cpus.c
> > +++ b/cpus.c
> > @@ -1068,6 +1068,7 @@ static int do_vm_stop(RunState state, bool send_stop)
> >      int ret = 0;
> >  
> >      if (runstate_is_running()) {
> > +        vm_state_notify(1, state);
> >          cpu_disable_ticks();
> >          pause_all_vcpus();
> >          runstate_set(state);
> > -- 
> > 2.7.4
> > 
> > 
> 
> -- 
> Peter Xu
Peter Xu July 16, 2019, 7:50 a.m. UTC | #3
On Tue, Jul 16, 2019 at 03:29:19AM -0400, Yan Zhao wrote:
> On Tue, Jul 16, 2019 at 03:23:16PM +0800, Peter Xu wrote:
> > On Tue, Jul 16, 2019 at 03:10:42PM +0800, Yan Zhao wrote:
> > > for some devices to do live migration, it is needed to do something
> > > immediately before vcpu stops. add a notification here.
> > 
> > Hi, Yan,
> > 
> > Could I ask for a more detailed commit message here?  E.g., what is
> > "some devices"?  And, what's the problem behind?
> >
> hi Peter,
> 
> Some devices refer to assigned devices, like NICs.
> For assigned devices to do live migration, it is sometimes required that
> source device is stopped before stopping source vcpus. vcpus can do some
> final cleanups (like handling interrupt) in that case.

I see, so this is a prerequisite of another work?

IMHO it would make more sense to have this patch to be with that
patchset, then it'll justify itself with reasoning.  Unless I
misunderstood - this single patch seems to help nothing if as a
standalone one.

Thanks,
Yan Zhao July 16, 2019, 7:57 a.m. UTC | #4
On Tue, Jul 16, 2019 at 03:50:25PM +0800, Peter Xu wrote:
> On Tue, Jul 16, 2019 at 03:29:19AM -0400, Yan Zhao wrote:
> > On Tue, Jul 16, 2019 at 03:23:16PM +0800, Peter Xu wrote:
> > > On Tue, Jul 16, 2019 at 03:10:42PM +0800, Yan Zhao wrote:
> > > > for some devices to do live migration, it is needed to do something
> > > > immediately before vcpu stops. add a notification here.
> > > 
> > > Hi, Yan,
> > > 
> > > Could I ask for a more detailed commit message here?  E.g., what is
> > > "some devices"?  And, what's the problem behind?
> > >
> > hi Peter,
> > 
> > Some devices refer to assigned devices, like NICs.
> > For assigned devices to do live migration, it is sometimes required that
> > source device is stopped before stopping source vcpus. vcpus can do some
> > final cleanups (like handling interrupt) in that case.
> 
> I see, so this is a prerequisite of another work?

Yes.
> 
> IMHO it would make more sense to have this patch to be with that
> patchset, then it'll justify itself with reasoning.  Unless I
> misunderstood - this single patch seems to help nothing if as a
> standalone one.

It would be better. but this patch alone is also somewhat general,
as it only adds an extra notification and wouldn't impact others.

Only after this patch is upstreamed, can VFIO live migration have a
second choice for those devices of special requirements.

Hope it can get understanding from you:)

Or do you think I need to change the commit message a little to unbind
from migration?

Thanks
Yan
Peter Xu July 16, 2019, 10:33 a.m. UTC | #5
On Tue, Jul 16, 2019 at 03:57:49AM -0400, Yan Zhao wrote:
> On Tue, Jul 16, 2019 at 03:50:25PM +0800, Peter Xu wrote:
> > On Tue, Jul 16, 2019 at 03:29:19AM -0400, Yan Zhao wrote:
> > > On Tue, Jul 16, 2019 at 03:23:16PM +0800, Peter Xu wrote:
> > > > On Tue, Jul 16, 2019 at 03:10:42PM +0800, Yan Zhao wrote:
> > > > > for some devices to do live migration, it is needed to do something
> > > > > immediately before vcpu stops. add a notification here.
> > > > 
> > > > Hi, Yan,
> > > > 
> > > > Could I ask for a more detailed commit message here?  E.g., what is
> > > > "some devices"?  And, what's the problem behind?
> > > >
> > > hi Peter,
> > > 
> > > Some devices refer to assigned devices, like NICs.
> > > For assigned devices to do live migration, it is sometimes required that
> > > source device is stopped before stopping source vcpus. vcpus can do some
> > > final cleanups (like handling interrupt) in that case.
> > 
> > I see, so this is a prerequisite of another work?
> 
> Yes.
> > 
> > IMHO it would make more sense to have this patch to be with that
> > patchset, then it'll justify itself with reasoning.  Unless I
> > misunderstood - this single patch seems to help nothing if as a
> > standalone one.
> 
> It would be better. but this patch alone is also somewhat general,
> as it only adds an extra notification and wouldn't impact others.
> 
> Only after this patch is upstreamed, can VFIO live migration have a
> second choice for those devices of special requirements.
> 
> Hope it can get understanding from you:)

No you don't need my understanding, as long as the maintainer likes it
it's good enough, so you probably only need to persuade the
maintainers. :)

But again for me I would prefer patch like this to simply be the first
patch of your live migration series.  There can be some exceptions
like that when the prerequisite is too big so we'd better split them
out to do things step by step, but this patch (which is a oneliner)
should not be the case.

Thanks,

> 
> Or do you think I need to change the commit message a little to unbind
> from migration?
> 
> Thanks
> Yan
>
Yan Zhao July 17, 2019, 2 a.m. UTC | #6
On Tue, Jul 16, 2019 at 06:33:41PM +0800, Peter Xu wrote:
> On Tue, Jul 16, 2019 at 03:57:49AM -0400, Yan Zhao wrote:
> > On Tue, Jul 16, 2019 at 03:50:25PM +0800, Peter Xu wrote:
> > > On Tue, Jul 16, 2019 at 03:29:19AM -0400, Yan Zhao wrote:
> > > > On Tue, Jul 16, 2019 at 03:23:16PM +0800, Peter Xu wrote:
> > > > > On Tue, Jul 16, 2019 at 03:10:42PM +0800, Yan Zhao wrote:
> > > > > > for some devices to do live migration, it is needed to do something
> > > > > > immediately before vcpu stops. add a notification here.
> > > > > 
> > > > > Hi, Yan,
> > > > > 
> > > > > Could I ask for a more detailed commit message here?  E.g., what is
> > > > > "some devices"?  And, what's the problem behind?
> > > > >
> > > > hi Peter,
> > > > 
> > > > Some devices refer to assigned devices, like NICs.
> > > > For assigned devices to do live migration, it is sometimes required that
> > > > source device is stopped before stopping source vcpus. vcpus can do some
> > > > final cleanups (like handling interrupt) in that case.
> > > 
> > > I see, so this is a prerequisite of another work?
> > 
> > Yes.
> > > 
> > > IMHO it would make more sense to have this patch to be with that
> > > patchset, then it'll justify itself with reasoning.  Unless I
> > > misunderstood - this single patch seems to help nothing if as a
> > > standalone one.
> > 
> > It would be better. but this patch alone is also somewhat general,
> > as it only adds an extra notification and wouldn't impact others.
> > 
> > Only after this patch is upstreamed, can VFIO live migration have a
> > second choice for those devices of special requirements.
> > 
> > Hope it can get understanding from you:)
> 
> No you don't need my understanding, as long as the maintainer likes it
> it's good enough, so you probably only need to persuade the
> maintainers. :)
> 
> But again for me I would prefer patch like this to simply be the first
> patch of your live migration series.  There can be some exceptions
> like that when the prerequisite is too big so we'd better split them
> out to do things step by step, but this patch (which is a oneliner)
> should not be the case.
> 
> Thanks,
>

ok. Thank you, Peter :)

> > 
> > Or do you think I need to change the commit message a little to unbind
> > from migration?
> > 
> > Thanks
> > Yan
> > 
> 
> -- 
> Peter Xu
no-reply@patchew.org July 17, 2019, 5:36 a.m. UTC | #7
Patchew URL: https://patchew.org/QEMU/1563261042-15974-1-git-send-email-yan.y.zhao@intel.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

PASS 1 fdc-test /x86_64/fdc/cmos
PASS 2 fdc-test /x86_64/fdc/no_media_on_start
PASS 3 fdc-test /x86_64/fdc/read_without_media
==7852==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 4 fdc-test /x86_64/fdc/media_change
PASS 5 fdc-test /x86_64/fdc/sense_interrupt
PASS 6 fdc-test /x86_64/fdc/relative_seek
---
PASS 32 test-opts-visitor /visitor/opts/range/beyond
PASS 33 test-opts-visitor /visitor/opts/dict/unvisited
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-coroutine -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-coroutine" 
==7882==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
==7882==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7fffe903e000; bottom 0x7fdb692f8000; size: 0x00247fd46000 (156763447296)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189
PASS 1 test-coroutine /basic/no-dangling-access
---
PASS 13 test-aio /aio/event/wait/no-flush-cb
PASS 12 fdc-test /x86_64/fdc/read_no_dma_19
PASS 13 fdc-test /x86_64/fdc/fuzz-registers
==7901==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 14 test-aio /aio/timer/schedule
PASS 15 test-aio /aio/coroutine/queue-chaining
PASS 16 test-aio /aio-gsource/flush
---
PASS 26 test-aio /aio-gsource/event/flush
PASS 27 test-aio /aio-gsource/event/wait/no-flush-cb
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img tests/ide-test -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="ide-test" 
==7910==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 28 test-aio /aio-gsource/timer/schedule
PASS 1 ide-test /x86_64/ide/identify
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-aio-multithread -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-aio-multithread" 
==7917==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
==7919==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 1 test-aio-multithread /aio/multi/lifecycle
PASS 2 ide-test /x86_64/ide/flush
PASS 2 test-aio-multithread /aio/multi/schedule
==7937==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 3 ide-test /x86_64/ide/bmdma/simple_rw
PASS 3 test-aio-multithread /aio/multi/mutex/contended
==7948==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 4 ide-test /x86_64/ide/bmdma/trim
==7959==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 5 ide-test /x86_64/ide/bmdma/short_prdt
==7965==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 4 test-aio-multithread /aio/multi/mutex/handoff
PASS 6 ide-test /x86_64/ide/bmdma/one_sector_short_prdt
PASS 5 test-aio-multithread /aio/multi/mutex/mcs
==7976==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 7 ide-test /x86_64/ide/bmdma/long_prdt
PASS 6 test-aio-multithread /aio/multi/mutex/pthread
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-throttle -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-throttle" 
==7988==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
==7994==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 1 test-throttle /throttle/leak_bucket
PASS 2 test-throttle /throttle/compute_wait
PASS 3 test-throttle /throttle/init
---
PASS 13 test-throttle /throttle/config/ranges
PASS 14 test-throttle /throttle/config/max
PASS 15 test-throttle /throttle/config/iops_size
==7988==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffdab354000; bottom 0x7fab6c9aa000; size: 0x00523e9aa000 (353237639168)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-thread-pool -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-thread-pool" 
PASS 8 ide-test /x86_64/ide/bmdma/no_busmaster
==8001==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 1 test-thread-pool /thread-pool/submit
PASS 2 test-thread-pool /thread-pool/submit-aio
PASS 3 test-thread-pool /thread-pool/submit-co
PASS 4 test-thread-pool /thread-pool/submit-many
PASS 9 ide-test /x86_64/ide/flush/nodev
PASS 5 test-thread-pool /thread-pool/cancel
==8073==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 10 ide-test /x86_64/ide/flush/empty_drive
==8078==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
**
ERROR:/tmp/qemu-test/src/tests/ide-test.c:754:test_retry_flush: assertion failed ((data) & (BSY | DRDY) == (BSY | DRDY)): (0x00000040 == 0x000000c0)
ERROR - Bail out! ERROR:/tmp/qemu-test/src/tests/ide-test.c:754:test_retry_flush: assertion failed ((data) & (BSY | DRDY) == (BSY | DRDY)): (0x00000040 == 0x000000c0)
PASS 6 test-thread-pool /thread-pool/cancel-async
make: *** [/tmp/qemu-test/src/tests/Makefile.include:899: check-qtest-x86_64] Error 1
make: *** Waiting for unfinished jobs....
---
PASS 42 test-hbitmap /hbitmap/next_dirty_area/next_dirty_area_1
PASS 43 test-hbitmap /hbitmap/next_dirty_area/next_dirty_area_4
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-bdrv-drain -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-bdrv-drain" 
==8091==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 1 test-bdrv-drain /bdrv-drain/nested
PASS 2 test-bdrv-drain /bdrv-drain/multiparent
PASS 3 test-bdrv-drain /bdrv-drain/set_aio_context
---
PASS 38 test-bdrv-drain /bdrv-drain/detach/driver_cb
PASS 39 test-bdrv-drain /bdrv-drain/attach/drain
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-bdrv-graph-mod -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-bdrv-graph-mod" 
==8131==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 1 test-bdrv-graph-mod /bdrv-graph-mod/update-perm-tree
PASS 2 test-bdrv-graph-mod /bdrv-graph-mod/should-update-child
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-blockjob -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-blockjob" 
==8136==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 1 test-blockjob /blockjob/ids
PASS 2 test-blockjob /blockjob/cancel/created
PASS 3 test-blockjob /blockjob/cancel/running
---
PASS 7 test-blockjob /blockjob/cancel/pending
PASS 8 test-blockjob /blockjob/cancel/concluded
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-blockjob-txn -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-blockjob-txn" 
==8141==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 1 test-blockjob-txn /single/success
PASS 2 test-blockjob-txn /single/failure
PASS 3 test-blockjob-txn /single/cancel
---
PASS 6 test-blockjob-txn /pair/cancel
PASS 7 test-blockjob-txn /pair/fail-cancel-race
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-block-backend -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-block-backend" 
==8146==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 1 test-block-backend /block-backend/drain_aio_error
PASS 2 test-block-backend /block-backend/drain_all_aio_error
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-block-iothread -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-block-iothread" 
==8151==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 1 test-block-iothread /sync-op/pread
PASS 2 test-block-iothread /sync-op/pwrite
PASS 3 test-block-iothread /sync-op/load_vmstate
---
PASS 15 test-block-iothread /propagate/diamond
PASS 16 test-block-iothread /propagate/mirror
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-image-locking -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-image-locking" 
==8172==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 1 test-image-locking /image-locking/basic
PASS 2 test-image-locking /image-locking/set-perm-abort
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-x86-cpuid -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-x86-cpuid" 
---
PASS 1 test-logging /logging/parse_range
PASS 2 test-logging /logging/parse_path
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-replication -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-replication" 
==8589==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 1 test-replication /replication/primary/read
PASS 2 test-replication /replication/primary/write
PASS 3 test-replication /replication/primary/start
---
PASS 6 test-replication /replication/primary/get_error_all
PASS 7 test-replication /replication/secondary/read
PASS 8 test-replication /replication/secondary/write
==8589==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffe5e2c9000; bottom 0x7f6159afc000; size: 0x009d047cd000 (674385154048)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189
PASS 9 test-replication /replication/secondary/start


The full log is available at
http://patchew.org/logs/1563261042-15974-1-git-send-email-yan.y.zhao@intel.com/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Dr. David Alan Gilbert July 25, 2019, 10:39 a.m. UTC | #8
* Yan Zhao (yan.y.zhao@intel.com) wrote:
> for some devices to do live migration, it is needed to do something
> immediately before vcpu stops. add a notification here.
> 
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> ---
>  cpus.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/cpus.c b/cpus.c
> index b09b702..d5d4abe 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -1068,6 +1068,7 @@ static int do_vm_stop(RunState state, bool send_stop)
>      int ret = 0;
>  
>      if (runstate_is_running()) {
> +        vm_state_notify(1, state);

SO that's quite interesting in that you'll end up getting a
notificatiion like 'running=true, state=RUN_STATE_SHUTDOWN'
that might be unexpected by existing callers.

Have you checked existing callers?  Also does this cause another event
to be sent on the QMP - if so we need to chekc if this would confuse
libvirt.

Dave

>          cpu_disable_ticks();
>          pause_all_vcpus();
>          runstate_set(state);
> -- 
> 2.7.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Yan Zhao July 26, 2019, 12:12 a.m. UTC | #9
On Thu, Jul 25, 2019 at 06:39:07PM +0800, Dr. David Alan Gilbert wrote:
> * Yan Zhao (yan.y.zhao@intel.com) wrote:
> > for some devices to do live migration, it is needed to do something
> > immediately before vcpu stops. add a notification here.
> > 
> > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > ---
> >  cpus.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/cpus.c b/cpus.c
> > index b09b702..d5d4abe 100644
> > --- a/cpus.c
> > +++ b/cpus.c
> > @@ -1068,6 +1068,7 @@ static int do_vm_stop(RunState state, bool send_stop)
> >      int ret = 0;
> >  
> >      if (runstate_is_running()) {
> > +        vm_state_notify(1, state);
> 
> SO that's quite interesting in that you'll end up getting a
> notificatiion like 'running=true, state=RUN_STATE_SHUTDOWN'
> that might be unexpected by existing callers.
> 
> Have you checked existing callers?  Also does this cause another event
> to be sent on the QMP - if so we need to chekc if this would confuse
> libvirt.
> 
> Dave

hi Dave
yes, this may cause problem for existing handlers as this is an
unexpected condition. like for ide's ide_restart_cb.
So, do you think it's a better that do the notification earlier, before
vm_stop_force_state() in migration.c and call notifier_list_notify(&migration_state_notifiers, s)
to notify migration state instead ?

Thanks
Yan



> 
> >          cpu_disable_ticks();
> >          pause_all_vcpus();
> >          runstate_set(state);
> > -- 
> > 2.7.4
> > 
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
diff mbox series

Patch

diff --git a/cpus.c b/cpus.c
index b09b702..d5d4abe 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1068,6 +1068,7 @@  static int do_vm_stop(RunState state, bool send_stop)
     int ret = 0;
 
     if (runstate_is_running()) {
+        vm_state_notify(1, state);
         cpu_disable_ticks();
         pause_all_vcpus();
         runstate_set(state);