[v2,0/3] monitor: only run coroutine commands in qemu_aio_context

Message ID 20240118144823.1497953-1-stefanha@redhat.com (mailing list archive)

Message

Stefan Hajnoczi Jan. 18, 2024, 2:48 p.m. UTC
v2:
- Filter image format in 141 test output [Kevin]
- Fix pylint and mypy errors in 141 [Kevin]

Several bugs have been reported related to how QMP commands are rescheduled in
qemu_aio_context:
- https://gitlab.com/qemu-project/qemu/-/issues/1933
- https://issues.redhat.com/browse/RHEL-17369
- https://bugzilla.redhat.com/show_bug.cgi?id=2215192
- https://bugzilla.redhat.com/show_bug.cgi?id=2214985

The first instance of the bug interacted with drain_call_rcu() temporarily
dropping the BQL and resulted in vCPU threads entering device emulation code
simultaneously (something that should never happen). I set out to make
drain_call_rcu() safe to use in this environment, but Paolo and Kevin discussed
the possibility of avoiding rescheduling the monitor_qmp_dispatcher_co()
coroutine for non-coroutine commands. This would prevent monitor commands from
running during vCPU thread aio_poll() entirely and would address the root cause.
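
As a rough illustration of that idea, here is a toy Python model of the
dispatch decision. It is not the QEMU C code: the Command class, the
dispatch_context() helper and the choice of example commands are assumptions
made for this sketch, and the "coroutine" field only stands in for a command
being marked as a coroutine command.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    """Hypothetical stand-in for a registered QMP command."""
    name: str
    coroutine: bool  # assumption: models the command's coroutine marking


def dispatch_context(cmd: Command) -> str:
    """Return the name of the context in which the handler would run.

    Only coroutine commands are moved to qemu_aio_context; everything
    else stays in the iohandler context and therefore cannot be picked
    up by an aio_poll() on qemu_aio_context in a vCPU thread.
    """
    if cmd.coroutine:
        return "qemu_aio_context"
    return "iohandler"


# Illustrative commands only; the flags are set by hand for the example.
for cmd in (Command("block_resize", True), Command("device_add", False)):
    print(f"{cmd.name} -> {dispatch_context(cmd)}")
```

The point of the sketch is only that the decision depends on the command
type, so non-coroutine handlers never get rescheduled.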

This patch series implements this idea. qemu-iotests is sensitive to the exact
order in which QMP events and responses are emitted. Running QMP handlers in
the iohandler AioContext causes some QMP events to be ordered differently than
before. It is therefore necessary to adjust the reference output in many test
cases. The actual QMP code change is small and everything else is just to make
qemu-iotests happy.
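
To give a feel for the kind of output filtering the test changes rely on, here
is a minimal sketch of a filter that hides generated node names in a QMP
message before it is compared with reference output. It is not the helper
added to iotests.py by this series: the function name, the assumed "#block"
followed by digits naming pattern, the "NODE_NAME" placeholder and the sample
message are all assumptions for illustration.

```python
import re
from typing import Any

# Assumption: generated node names look like "#block" followed by digits.
GENERATED_NODE_ID = re.compile(r"#block\d+")


def filter_generated_node_ids_sketch(msg: Any) -> Any:
    """Recursively replace generated node names in a decoded QMP message."""
    if isinstance(msg, dict):
        return {k: filter_generated_node_ids_sketch(v) for k, v in msg.items()}
    if isinstance(msg, list):
        return [filter_generated_node_ids_sketch(v) for v in msg]
    if isinstance(msg, str):
        return GENERATED_NODE_ID.sub("NODE_NAME", msg)
    return msg  # numbers, booleans and None pass through unchanged


# Hypothetical error response, made up for this example.
sample = {"error": {"class": "GenericError",
                    "desc": "Node '#block123' is busy"}}
print(filter_generated_node_ids_sketch(sample))
```

A filter like this keeps the reference output stable across runs, since the
numeric part of a generated name is not guaranteed to be the same each time.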

If you have bugs related to the same issue, please retest them with these
patches. Thanks!

Stefan Hajnoczi (3):
  iotests: add filter_qmp_generated_node_ids()
  iotests: port 141 to Python for reliable QMP testing
  monitor: only run coroutine commands in qemu_aio_context

 monitor/qmp.c                                 |  17 -
 qapi/qmp-dispatch.c                           |  24 +-
 tests/qemu-iotests/060.out                    |   4 +-
 tests/qemu-iotests/071.out                    |   4 +-
 tests/qemu-iotests/081.out                    |  16 +-
 tests/qemu-iotests/087.out                    |  12 +-
 tests/qemu-iotests/108.out                    |   2 +-
 tests/qemu-iotests/109                        |   4 +-
 tests/qemu-iotests/109.out                    |  78 ++---
 tests/qemu-iotests/117.out                    |   2 +-
 tests/qemu-iotests/120.out                    |   2 +-
 tests/qemu-iotests/127.out                    |   2 +-
 tests/qemu-iotests/140.out                    |   2 +-
 tests/qemu-iotests/141                        | 307 ++++++++----------
 tests/qemu-iotests/141.out                    | 190 +++--------
 tests/qemu-iotests/143.out                    |   2 +-
 tests/qemu-iotests/156.out                    |   2 +-
 tests/qemu-iotests/176.out                    |  16 +-
 tests/qemu-iotests/182.out                    |   2 +-
 tests/qemu-iotests/183.out                    |   4 +-
 tests/qemu-iotests/184.out                    |  32 +-
 tests/qemu-iotests/185                        |   6 +-
 tests/qemu-iotests/185.out                    |  45 ++-
 tests/qemu-iotests/191.out                    |  16 +-
 tests/qemu-iotests/195.out                    |  16 +-
 tests/qemu-iotests/223.out                    |  16 +-
 tests/qemu-iotests/227.out                    |  32 +-
 tests/qemu-iotests/247.out                    |   2 +-
 tests/qemu-iotests/273.out                    |   8 +-
 tests/qemu-iotests/308                        |   4 +-
 tests/qemu-iotests/308.out                    |   4 +-
 tests/qemu-iotests/iotests.py                 |   7 +
 tests/qemu-iotests/tests/file-io-error        |   5 +-
 tests/qemu-iotests/tests/iothreads-resize.out |   2 +-
 tests/qemu-iotests/tests/qsd-jobs.out         |   4 +-
 35 files changed, 385 insertions(+), 506 deletions(-)

Comments

Kevin Wolf Jan. 18, 2024, 6:09 p.m. UTC | #1
On 18.01.2024 at 15:48, Stefan Hajnoczi wrote:
> v2:
> - Filter image format in 141 test output [Kevin]
> - Fix pylint and mypy errors in 141 [Kevin]
> 
> Several bugs have been reported related to how QMP commands are rescheduled in
> qemu_aio_context:
> - https://gitlab.com/qemu-project/qemu/-/issues/1933
> - https://issues.redhat.com/browse/RHEL-17369
> - https://bugzilla.redhat.com/show_bug.cgi?id=2215192
> - https://bugzilla.redhat.com/show_bug.cgi?id=2214985
> 
> The first instance of the bug interacted with drain_call_rcu() temporarily
> dropping the BQL and resulted in vCPU threads entering device emulation code
> simultaneously (something that should never happen). I set out to make
> drain_call_rcu() safe to use in this environment, but Paolo and Kevin discussed
> the possibility of avoiding rescheduling the monitor_qmp_dispatcher_co()
> coroutine for non-coroutine commands. This would prevent monitor commands from
> running during vCPU thread aio_poll() entirely and addresses the root cause.
> 
> This patch series implements this idea. qemu-iotests is sensitive to the exact
> order in which QMP events and responses are emitted. Running QMP handlers in
> the iohandler AioContext causes some QMP events to be ordered differently than
> before. It is therefore necessary to adjust the reference output in many test
> cases. The actual QMP code change is small and everything else is just to make
> qemu-iotests happy.
> 
> If you have bugs related to the same issue, please retest them with these
> patches. Thanks!

Thanks, applied to the block branch.

Kevin