mbox series

[v7,00/13] monitor: Optionally run handlers in coroutines

Message ID 20200909151149.490589-1-kwolf@redhat.com
Headers show
Series monitor: Optionally run handlers in coroutines | expand

Message

Kevin Wolf Sept. 9, 2020, 3:11 p.m. UTC
Some QMP command handlers can block the main loop for a relatively long
time, for example because they perform some I/O. This is quite nasty.
Allowing such handlers to run in a coroutine where they can yield (and
therefore release the BQL) while waiting for an event such as I/O
completion solves the problem.

This series adds the infrastructure to allow this and switches
block_resize to run in a coroutine as a first example.

This is an alternative solution to Marc-André's "monitor: add
asynchronous command type" series.

v7:
- Added patch 2 to add a Monitor parameter to monitor_get_cpu_index(),
  too [Markus]
- Let monitor_set_cur() return the old monitor [Markus]
- Fixed comment about linking stub objects in test-util-sockets [Markus]
- More detailed commit message for per-coroutine current monitor and
  document that monitor_set_cur(NULL) must be called eventually [Markus]
- Resolve some merge conflicts on rebase

v6:
- Fixed cur_mon behaviour: It should always point to the Monitor while
  we're in the handler coroutine, but be NULL while the handler
  coroutines has yielded. [Markus]
- Give HMP handlers the coroutine option, too, because they will call
  QMP handlers, and life is easier when we know whether we are in
  coroutine context or not.
- Fixed block_resize for block devices in iothreads and for HMP
- Resolved some merge conflict with QAPI generator and monitor
  refactorings that were merged in the meantime

v5:
- Improved comments and documentation [Markus]

v4:
- Forbid 'oob': true, 'coroutine': true [Markus]
- Removed Python type hints [Markus]
- Introduced separate bool qmp_dispatcher_co_shutdown to make it clearer
  how a shutdown request is signalled to the dispatcher [Markus]
- Allow using aio_poll() with iohandler_ctx and use that instead of
  aio_bh_poll() [Markus]
- Removed coroutine_fn from qmp_block_resize() again because at least
  one caller (HMP) calls it outside of coroutine context [Markus]
- Tried to make the synchronisation between dispatcher and the monitor
  thread clearer, and fixed a race condition
- Improved documentation and comments

v3:
- Fix race between monitor thread and dispatcher that could schedule the
  dispatcher coroutine twice if a second requests comes in before the
  dispatcher can wake up [Patchew]

v2:
- Fix typo in a commit message [Eric]
- Use hyphen instead of underscore for the test command [Eric]
- Mark qmp_block_resize() as coroutine_fn [Stefan]


Kevin Wolf (13):
  monitor: Add Monitor parameter to monitor_set_cpu()
  monitor: Add Monitor parameter to monitor_get_cpu_index()
  monitor: Use getter/setter functions for cur_mon
  hmp: Update current monitor only in handle_hmp_command()
  qmp: Assert that no other monitor is active
  qmp: Call monitor_set_cur() only in qmp_dispatch()
  monitor: Make current monitor a per-coroutine property
  qapi: Add a 'coroutine' flag for commands
  qmp: Move dispatcher to a coroutine
  hmp: Add support for coroutine command handlers
  util/async: Add aio_co_reschedule_self()
  block: Add bdrv_co_move_to_aio_context()
  block: Convert 'block_resize' to coroutine

 qapi/block-core.json                    |   3 +-
 docs/devel/qapi-code-gen.txt            |  12 +++
 include/block/aio.h                     |  10 ++
 include/block/block.h                   |   6 ++
 include/monitor/monitor.h               |   7 +-
 include/qapi/qmp/dispatch.h             |   5 +-
 monitor/monitor-internal.h              |   7 +-
 audio/wavcapture.c                      |   8 +-
 block.c                                 |  10 ++
 blockdev.c                              |  13 ++-
 dump/dump.c                             |   2 +-
 hw/core/machine-hmp-cmds.c              |   2 +-
 hw/scsi/vhost-scsi.c                    |   2 +-
 hw/virtio/vhost-vsock.c                 |   2 +-
 migration/fd.c                          |   4 +-
 monitor/hmp-cmds.c                      |   4 +-
 monitor/hmp.c                           |  44 ++++++--
 monitor/misc.c                          |  38 ++++---
 monitor/monitor.c                       | 101 ++++++++++++++++--
 monitor/qmp-cmds-control.c              |   2 +
 monitor/qmp-cmds.c                      |   2 +-
 monitor/qmp.c                           | 130 +++++++++++++++++-------
 net/socket.c                            |   2 +-
 net/tap.c                               |   6 +-
 qapi/qmp-dispatch.c                     |  61 ++++++++++-
 qapi/qmp-registry.c                     |   3 +
 qga/main.c                              |   2 +-
 softmmu/cpus.c                          |   2 +-
 stubs/monitor-core.c                    |  10 +-
 tests/test-qmp-cmds.c                   |  10 +-
 tests/test-util-sockets.c               |  12 +--
 trace/control.c                         |   2 +-
 util/aio-posix.c                        |   8 +-
 util/async.c                            |  30 ++++++
 util/qemu-error.c                       |   6 +-
 util/qemu-print.c                       |   3 +-
 util/qemu-sockets.c                     |   1 +
 scripts/qapi/commands.py                |  10 +-
 scripts/qapi/doc.py                     |   2 +-
 scripts/qapi/expr.py                    |  10 +-
 scripts/qapi/introspect.py              |   2 +-
 scripts/qapi/schema.py                  |  12 ++-
 tests/qapi-schema/test-qapi.py          |   7 +-
 hmp-commands.hx                         |   1 +
 tests/qapi-schema/meson.build           |   1 +
 tests/qapi-schema/oob-coroutine.err     |   2 +
 tests/qapi-schema/oob-coroutine.json    |   2 +
 tests/qapi-schema/oob-coroutine.out     |   0
 tests/qapi-schema/qapi-schema-test.json |   1 +
 tests/qapi-schema/qapi-schema-test.out  |   2 +
 50 files changed, 481 insertions(+), 143 deletions(-)
 create mode 100644 tests/qapi-schema/oob-coroutine.err
 create mode 100644 tests/qapi-schema/oob-coroutine.json
 create mode 100644 tests/qapi-schema/oob-coroutine.out

Comments

no-reply@patchew.org Sept. 9, 2020, 3:24 p.m. UTC | #1
Patchew URL: https://patchew.org/QEMU/20200909151149.490589-1-kwolf@redhat.com/



Hi,

This series failed build test on FreeBSD host. Please find the details below.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
# Testing script will be invoked under the git checkout with
# HEAD pointing to a commit that has the patches applied on top of "base"
# branch
if qemu-system-x86_64 --help >/dev/null 2>&1; then
  QEMU=qemu-system-x86_64
elif /usr/libexec/qemu-kvm --help >/dev/null 2>&1; then
  QEMU=/usr/libexec/qemu-kvm
else
  exit 1
fi
make vm-build-freebsd J=21 QEMU=$QEMU
exit 0
=== TEST SCRIPT END ===




The full log is available at
http://patchew.org/logs/20200909151149.490589-1-kwolf@redhat.com/testing.FreeBSD/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Stefan Hajnoczi Sept. 10, 2020, 1:24 p.m. UTC | #2
On Wed, Sep 09, 2020 at 05:11:36PM +0200, Kevin Wolf wrote:
> Some QMP command handlers can block the main loop for a relatively long
> time, for example because they perform some I/O. This is quite nasty.
> Allowing such handlers to run in a coroutine where they can yield (and
> therefore release the BQL) while waiting for an event such as I/O
> completion solves the problem.
> 
> This series adds the infrastructure to allow this and switches
> block_resize to run in a coroutine as a first example.
> 
> This is an alternative solution to Marc-André's "monitor: add
> asynchronous command type" series.

Please clarify the following in the QAPI documentation:
 * Is the QMP monitor suspended while the command is pending?
 * Are QMP events reported while the command is pending?

Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Markus Armbruster Sept. 14, 2020, 3:09 p.m. UTC | #3
Stefan Hajnoczi <stefanha@redhat.com> writes:

> On Wed, Sep 09, 2020 at 05:11:36PM +0200, Kevin Wolf wrote:
>> Some QMP command handlers can block the main loop for a relatively long
>> time, for example because they perform some I/O. This is quite nasty.
>> Allowing such handlers to run in a coroutine where they can yield (and
>> therefore release the BQL) while waiting for an event such as I/O
>> completion solves the problem.
>> 
>> This series adds the infrastructure to allow this and switches
>> block_resize to run in a coroutine as a first example.
>> 
>> This is an alternative solution to Marc-André's "monitor: add
>> asynchronous command type" series.
>
> Please clarify the following in the QAPI documentation:
>  * Is the QMP monitor suspended while the command is pending?
>  * Are QMP events reported while the command is pending?

Good points.  Kevin, I'd be willing to take this as a follow-up patch,
if that's more convenient for you.

> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>

Stefan, I could use your proper review of PATCH 11-13.  Pretty-please?
Stefan Hajnoczi Sept. 15, 2020, 2:58 p.m. UTC | #4
On Mon, Sep 14, 2020 at 05:09:49PM +0200, Markus Armbruster wrote:
> Stefan Hajnoczi <stefanha@redhat.com> writes:
> 
> > On Wed, Sep 09, 2020 at 05:11:36PM +0200, Kevin Wolf wrote:
> >> Some QMP command handlers can block the main loop for a relatively long
> >> time, for example because they perform some I/O. This is quite nasty.
> >> Allowing such handlers to run in a coroutine where they can yield (and
> >> therefore release the BQL) while waiting for an event such as I/O
> >> completion solves the problem.
> >> 
> >> This series adds the infrastructure to allow this and switches
> >> block_resize to run in a coroutine as a first example.
> >> 
> >> This is an alternative solution to Marc-André's "monitor: add
> >> asynchronous command type" series.
> >
> > Please clarify the following in the QAPI documentation:
> >  * Is the QMP monitor suspended while the command is pending?
> >  * Are QMP events reported while the command is pending?
> 
> Good points.  Kevin, I'd be willing to take this as a follow-up patch,
> if that's more convenient for you.
> 
> > Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
> 
> Stefan, I could use your proper review of PATCH 11-13.  Pretty-please?

Sounds good. I have reviewed the patches 11-13 and left questions for
Kevin.
Kevin Wolf Sept. 25, 2020, 5:15 p.m. UTC | #5
Am 10.09.2020 um 15:24 hat Stefan Hajnoczi geschrieben:
> On Wed, Sep 09, 2020 at 05:11:36PM +0200, Kevin Wolf wrote:
> > Some QMP command handlers can block the main loop for a relatively long
> > time, for example because they perform some I/O. This is quite nasty.
> > Allowing such handlers to run in a coroutine where they can yield (and
> > therefore release the BQL) while waiting for an event such as I/O
> > completion solves the problem.
> > 
> > This series adds the infrastructure to allow this and switches
> > block_resize to run in a coroutine as a first example.
> > 
> > This is an alternative solution to Marc-André's "monitor: add
> > asynchronous command type" series.
> 
> Please clarify the following in the QAPI documentation:
>  * Is the QMP monitor suspended while the command is pending?

Suspended as in monitor_suspend()? No.

>  * Are QMP events reported while the command is pending?

Hm, I don't know to be honest. But I think so.

Does it matter, though? I don't think events have a defined order
compared to command results, and the client can't respond to the event
anyway until the current command has completed.

Kevin
Stefan Hajnoczi Sept. 28, 2020, 8:46 a.m. UTC | #6
On Fri, Sep 25, 2020 at 07:15:41PM +0200, Kevin Wolf wrote:
> Am 10.09.2020 um 15:24 hat Stefan Hajnoczi geschrieben:
> > On Wed, Sep 09, 2020 at 05:11:36PM +0200, Kevin Wolf wrote:
> > > Some QMP command handlers can block the main loop for a relatively long
> > > time, for example because they perform some I/O. This is quite nasty.
> > > Allowing such handlers to run in a coroutine where they can yield (and
> > > therefore release the BQL) while waiting for an event such as I/O
> > > completion solves the problem.
> > > 
> > > This series adds the infrastructure to allow this and switches
> > > block_resize to run in a coroutine as a first example.
> > > 
> > > This is an alternative solution to Marc-André's "monitor: add
> > > asynchronous command type" series.
> > 
> > Please clarify the following in the QAPI documentation:
> >  * Is the QMP monitor suspended while the command is pending?
> 
> Suspended as in monitor_suspend()? No.
> 
> >  * Are QMP events reported while the command is pending?
> 
> Hm, I don't know to be honest. But I think so.
> 
> Does it matter, though? I don't think events have a defined order
> compared to command results, and the client can't respond to the event
> anyway until the current command has completed.

You're right, I don't think it matters because the client must expect
QMP events at any time.

I was trying to understand the semantics of coroutine monitor commands
from two perspectives:

1. The QMP client - do coroutine commands behave differently from
   non-coroutine commands? I think the answer is no. The monitor will
   not process more commands until the coroutine finishes?

2. The command implementation - which thread does the coroutine run in?
   I guess it runs in the main loop thread with the BQL and the
   AioContext acquired?
Markus Armbruster Sept. 28, 2020, 9:30 a.m. UTC | #7
Kevin Wolf <kwolf@redhat.com> writes:

> Am 10.09.2020 um 15:24 hat Stefan Hajnoczi geschrieben:
>> On Wed, Sep 09, 2020 at 05:11:36PM +0200, Kevin Wolf wrote:
>> > Some QMP command handlers can block the main loop for a relatively long
>> > time, for example because they perform some I/O. This is quite nasty.
>> > Allowing such handlers to run in a coroutine where they can yield (and
>> > therefore release the BQL) while waiting for an event such as I/O
>> > completion solves the problem.
>> > 
>> > This series adds the infrastructure to allow this and switches
>> > block_resize to run in a coroutine as a first example.
>> > 
>> > This is an alternative solution to Marc-André's "monitor: add
>> > asynchronous command type" series.
>> 
>> Please clarify the following in the QAPI documentation:
>>  * Is the QMP monitor suspended while the command is pending?
>
> Suspended as in monitor_suspend()? No.

A suspended monitor doesn't read monitor input.

We suspend

* a QMP monitor while the request queue is full

* an HMP monitor while it executes a command

* a multiplexed HMP monitor while the "mux-focus" is elsewhere

* an HMP monitor when it executes command "quit", forever

* an HMP monitor while it executes command "migrate" without -d

Let me explain the first item in a bit more detail.  Before OOB, a QMP
monitor was also suspended while it executed a command.  To make OOB
work, we moved the QMP monitors to an I/O thread and added a request
queue, drained by the main loop.  QMP monitors continue to read
commands, execute right away if OOB, else queue, suspend when queue gets
full, resume when it gets non-full.

The "run command in coroutine context" feature does not affect any of
this.

qapi-code-gen.txt does not talk about monitor suspension at all.  It's
instead discussed in qmp-spec.txt section 2.3.1 Out-of-band execution.

Stefan, what would you like us to clarify, and where?

>>  * Are QMP events reported while the command is pending?
>
> Hm, I don't know to be honest. But I think so.

Yes, events should be reported while a command is being executed.

Sending events takes locks.  Their critical sections are all
short-lived.  Another possible delay is the underlying character device
failing the send with EAGAIN.  That's all.

Fine print: qapi_event_emit() takes @monitor_lock.  It sends to each QMP
monitor with qmp_send_response(), which uses monitor_puts(), which takes
the monitor's @mon_lock.

The "run command in coroutine context" feature does not affect any of
this.

> Does it matter, though? I don't think events have a defined order
> compared to command results, and the client can't respond to the event
> anyway until the current command has completed.

Stefan, what would you like us to clarify, and where?
Kevin Wolf Sept. 28, 2020, 9:47 a.m. UTC | #8
Am 28.09.2020 um 10:46 hat Stefan Hajnoczi geschrieben:
> On Fri, Sep 25, 2020 at 07:15:41PM +0200, Kevin Wolf wrote:
> > Am 10.09.2020 um 15:24 hat Stefan Hajnoczi geschrieben:
> > > On Wed, Sep 09, 2020 at 05:11:36PM +0200, Kevin Wolf wrote:
> > > > Some QMP command handlers can block the main loop for a relatively long
> > > > time, for example because they perform some I/O. This is quite nasty.
> > > > Allowing such handlers to run in a coroutine where they can yield (and
> > > > therefore release the BQL) while waiting for an event such as I/O
> > > > completion solves the problem.
> > > > 
> > > > This series adds the infrastructure to allow this and switches
> > > > block_resize to run in a coroutine as a first example.
> > > > 
> > > > This is an alternative solution to Marc-André's "monitor: add
> > > > asynchronous command type" series.
> > > 
> > > Please clarify the following in the QAPI documentation:
> > >  * Is the QMP monitor suspended while the command is pending?
> > 
> > Suspended as in monitor_suspend()? No.
> > 
> > >  * Are QMP events reported while the command is pending?
> > 
> > Hm, I don't know to be honest. But I think so.
> > 
> > Does it matter, though? I don't think events have a defined order
> > compared to command results, and the client can't respond to the event
> > anyway until the current command has completed.
> 
> You're right, I don't think it matters because the client must expect
> QMP events at any time.
> 
> I was trying to understand the semantics of coroutine monitor commands
> from two perspectives:
> 
> 1. The QMP client - do coroutine commands behave differently from
>    non-coroutine commands? I think the answer is no. The monitor will
>    not process more commands until the coroutine finishes?

No, on the wire, things should look exactly the same. If you consider
more than the QMP traffic and the client communicates with the guest, it
might see that the guest isn't blocked any more, of course.

> 2. The command implementation - which thread does the coroutine run in?
>    I guess it runs in the main loop thread with the BQL and the
>    AioContext acquired?

By default, yes. But the coroutine can reschedule itself to another
thread. Block-related handlers will want to reschedule themselves to
bs->aio_context because you can't just hold the AioContext lock from
another thread across yields. This is what block_resize does after this
series.

Kevin