mbox series

[v5,00/21] block: remove aio_disable_external() API

Message ID 20230504195327.695107-1-stefanha@redhat.com (mailing list archive)
Headers show
Series block: remove aio_disable_external() API | expand

Message

Stefan Hajnoczi May 4, 2023, 7:53 p.m. UTC
v5:
- Use atomic accesses for in_flight counter in vhost-user-server.c [Kevin]
- Stash SCSIDevice id/lun values for VIRTIO_SCSI_T_TRANSPORT_RESET event
  before unrealizing the SCSIDevice [Kevin]
- Keep vhost-user-blk export .detach() callback so ctx is set to NULL [Kevin]
- Narrow BdrvChildClass and BlockDriver drained_{begin/end/poll} callbacks from
  IO_OR_GS_CODE() to GLOBAL_STATE_CODE() [Kevin]
- Include Kevin's "block: Fix use after free in blockdev_mark_auto_del()" to
  fix a latent bug that was exposed by this series

v4:
- Remove external_disable_cnt variable [Philippe]
- Add Patch 1 to fix assertion failure in .drained_end() -> blk_get_aio_context()
v3:
- Resend full patch series. v2 was sent in the middle of a git rebase and was
  missing patches. [Eric]
- Apply Reviewed-by tags.
v2:
- Do not rely on BlockBackend request queuing, implement .drained_begin/end()
  instead in xen-block, virtio-blk, and virtio-scsi [Paolo]
- Add qdev_is_realized() API [Philippe]
- Add patch to avoid AioContext lock around blk_exp_ref/unref() [Paolo]
- Add patch to call .drained_begin/end() from main loop thread to simplify
  callback implementations

The aio_disable_external() API temporarily suspends file descriptor monitoring
in the event loop. The block layer uses this to prevent new I/O requests being
submitted from the guest and elsewhere between bdrv_drained_begin() and
bdrv_drained_end().

While the block layer still needs to prevent new I/O requests in drained
sections, the aio_disable_external() API can be replaced with
.drained_begin/end/poll() callbacks that have been added to BdrvChildClass and
BlockDevOps.

This newer .bdrained_begin/end/poll() approach is attractive because it works
without specifying a specific AioContext. The block layer is moving towards
multi-queue and that means multiple AioContexts may be processing I/O
simultaneously.

The aio_disable_external() was always somewhat hacky. It suspends all file
descriptors that were registered with is_external=true, even if they have
nothing to do with the BlockDriverState graph nodes that are being drained.
It's better to solve a block layer problem in the block layer than to have an
odd event loop API solution.

The approach in this patch series is to implement BlockDevOps
.drained_begin/end() callbacks that temporarily stop file descriptor handlers.
This ensures that new I/O requests are not submitted in drained sections.

Kevin Wolf (1):
  block: Fix use after free in blockdev_mark_auto_del()

Stefan Hajnoczi (20):
  block-backend: split blk_do_set_aio_context()
  hw/qdev: introduce qdev_is_realized() helper
  virtio-scsi: avoid race between unplug and transport event
  virtio-scsi: stop using aio_disable_external() during unplug
  util/vhost-user-server: rename refcount to in_flight counter
  block/export: wait for vhost-user-blk requests when draining
  block/export: stop using is_external in vhost-user-blk server
  hw/xen: do not use aio_set_fd_handler(is_external=true) in
    xen_xenstore
  block: add blk_in_drain() API
  block: drain from main loop thread in bdrv_co_yield_to_drain()
  xen-block: implement BlockDevOps->drained_begin()
  hw/xen: do not set is_external=true on evtchn fds
  block/export: rewrite vduse-blk drain code
  block/export: don't require AioContext lock around blk_exp_ref/unref()
  block/fuse: do not set is_external=true on FUSE fd
  virtio: make it possible to detach host notifier from any thread
  virtio-blk: implement BlockDevOps->drained_begin()
  virtio-scsi: implement BlockDevOps->drained_begin()
  virtio: do not set is_external=true on host notifiers
  aio: remove aio_disable_external() API

 hw/block/dataplane/xen-block.h              |   2 +
 include/block/aio.h                         |  57 ---------
 include/block/block_int-common.h            |  90 +++++++-------
 include/block/export.h                      |   2 +
 include/hw/qdev-core.h                      |  17 ++-
 include/hw/scsi/scsi.h                      |  14 +++
 include/qemu/vhost-user-server.h            |   8 +-
 include/sysemu/block-backend-common.h       |  25 ++--
 include/sysemu/block-backend-global-state.h |   1 +
 util/aio-posix.h                            |   1 -
 block.c                                     |   7 --
 block/blkio.c                               |  15 +--
 block/block-backend.c                       |  78 ++++++------
 block/curl.c                                |  10 +-
 block/export/export.c                       |  13 +-
 block/export/fuse.c                         |  56 ++++++++-
 block/export/vduse-blk.c                    | 128 ++++++++++++++------
 block/export/vhost-user-blk-server.c        |  52 +++++++-
 block/io.c                                  |  16 ++-
 block/io_uring.c                            |   4 +-
 block/iscsi.c                               |   3 +-
 block/linux-aio.c                           |   4 +-
 block/nfs.c                                 |   5 +-
 block/nvme.c                                |   8 +-
 block/ssh.c                                 |   4 +-
 block/win32-aio.c                           |   6 +-
 blockdev.c                                  |  18 ++-
 hw/block/dataplane/virtio-blk.c             |  19 ++-
 hw/block/dataplane/xen-block.c              |  42 +++++--
 hw/block/virtio-blk.c                       |  38 +++++-
 hw/block/xen-block.c                        |  24 +++-
 hw/i386/kvm/xen_xenstore.c                  |   2 +-
 hw/scsi/scsi-bus.c                          |  46 ++++++-
 hw/scsi/scsi-disk.c                         |  27 ++++-
 hw/scsi/virtio-scsi-dataplane.c             |  31 +++--
 hw/scsi/virtio-scsi.c                       | 127 ++++++++++++++-----
 hw/virtio/virtio.c                          |   6 +-
 hw/xen/xen-bus.c                            |  11 +-
 io/channel-command.c                        |   6 +-
 io/channel-file.c                           |   3 +-
 io/channel-socket.c                         |   3 +-
 migration/rdma.c                            |  16 +--
 tests/unit/test-aio.c                       |  27 +----
 tests/unit/test-bdrv-drain.c                |  15 +--
 tests/unit/test-fdmon-epoll.c               |  73 -----------
 util/aio-posix.c                            |  20 +--
 util/aio-win32.c                            |   8 +-
 util/async.c                                |   3 +-
 util/fdmon-epoll.c                          |  18 +--
 util/fdmon-io_uring.c                       |   8 +-
 util/fdmon-poll.c                           |   3 +-
 util/main-loop.c                            |   7 +-
 util/qemu-coroutine-io.c                    |   7 +-
 util/vhost-user-server.c                    |  33 ++---
 hw/scsi/trace-events                        |   2 +
 tests/unit/meson.build                      |   3 -
 56 files changed, 738 insertions(+), 534 deletions(-)
 delete mode 100644 tests/unit/test-fdmon-epoll.c

Comments

Kevin Wolf May 4, 2023, 9:44 p.m. UTC | #1
Am 04.05.2023 um 21:53 hat Stefan Hajnoczi geschrieben:
> v5:
> - Use atomic accesses for in_flight counter in vhost-user-server.c [Kevin]
> - Stash SCSIDevice id/lun values for VIRTIO_SCSI_T_TRANSPORT_RESET event
>   before unrealizing the SCSIDevice [Kevin]
> - Keep vhost-user-blk export .detach() callback so ctx is set to NULL [Kevin]
> - Narrow BdrvChildClass and BlockDriver drained_{begin/end/poll} callbacks from
>   IO_OR_GS_CODE() to GLOBAL_STATE_CODE() [Kevin]
> - Include Kevin's "block: Fix use after free in blockdev_mark_auto_del()" to
>   fix a latent bug that was exposed by this series

I only just finished reviewing v4 when you had already sent v5, but it
hadn't arrived yet. I had a few more comments on what are now patches
17, 18, 19 and 21 in v5. I think they all still apply.

Kevin
Stefan Hajnoczi May 9, 2023, 5:51 p.m. UTC | #2
On Thu, May 04, 2023 at 11:44:42PM +0200, Kevin Wolf wrote:
> Am 04.05.2023 um 21:53 hat Stefan Hajnoczi geschrieben:
> > v5:
> > - Use atomic accesses for in_flight counter in vhost-user-server.c [Kevin]
> > - Stash SCSIDevice id/lun values for VIRTIO_SCSI_T_TRANSPORT_RESET event
> >   before unrealizing the SCSIDevice [Kevin]
> > - Keep vhost-user-blk export .detach() callback so ctx is set to NULL [Kevin]
> > - Narrow BdrvChildClass and BlockDriver drained_{begin/end/poll} callbacks from
> >   IO_OR_GS_CODE() to GLOBAL_STATE_CODE() [Kevin]
> > - Include Kevin's "block: Fix use after free in blockdev_mark_auto_del()" to
> >   fix a latent bug that was exposed by this series
> 
> I only just finished reviewing v4 when you had already sent v5, but it
> hadn't arrived yet. I had a few more comments on what are now patches
> 17, 18, 19 and 21 in v5. I think they all still apply.

I'm not sure which comments from v4 still apply. In my email client all
your replies were already read when I sent v5.

Maybe you can share the Message-Id of something I still need to address?

Thanks,
Stefan
Kevin Wolf May 9, 2023, 6:35 p.m. UTC | #3
Am 09.05.2023 um 19:51 hat Stefan Hajnoczi geschrieben:
> On Thu, May 04, 2023 at 11:44:42PM +0200, Kevin Wolf wrote:
> > Am 04.05.2023 um 21:53 hat Stefan Hajnoczi geschrieben:
> > > v5:
> > > - Use atomic accesses for in_flight counter in vhost-user-server.c [Kevin]
> > > - Stash SCSIDevice id/lun values for VIRTIO_SCSI_T_TRANSPORT_RESET event
> > >   before unrealizing the SCSIDevice [Kevin]
> > > - Keep vhost-user-blk export .detach() callback so ctx is set to NULL [Kevin]
> > > - Narrow BdrvChildClass and BlockDriver drained_{begin/end/poll} callbacks from
> > >   IO_OR_GS_CODE() to GLOBAL_STATE_CODE() [Kevin]
> > > - Include Kevin's "block: Fix use after free in blockdev_mark_auto_del()" to
> > >   fix a latent bug that was exposed by this series
> > 
> > I only just finished reviewing v4 when you had already sent v5, but it
> > hadn't arrived yet. I had a few more comments on what are now patches
> > 17, 18, 19 and 21 in v5. I think they all still apply.
> 
> I'm not sure which comments from v4 still apply. In my email client all
> your replies were already read when I sent v5.

Yes, but I added some more replies after you had sent v5 (and before I
fetched mail again to actually see v5).

> Maybe you can share the Message-Id of something I still need to address?

I thought the patch numbers identified them and were easier, but sure:

Message-ID: <ZFQc89cFJuoGF+qI@redhat.com>
Message-ID: <ZFQgBvWShB4NCymj@redhat.com>
Message-ID: <ZFQivbkVPcX3nECA@redhat.com>
Message-ID: <ZFQk2TdhZ6DiwM4t@redhat.com>

Kevin
Kevin Wolf May 9, 2023, 7:07 p.m. UTC | #4
Am 04.05.2023 um 21:53 hat Stefan Hajnoczi geschrieben:
> v5:
> - Use atomic accesses for in_flight counter in vhost-user-server.c [Kevin]
> - Stash SCSIDevice id/lun values for VIRTIO_SCSI_T_TRANSPORT_RESET event
>   before unrealizing the SCSIDevice [Kevin]
> - Keep vhost-user-blk export .detach() callback so ctx is set to NULL [Kevin]
> - Narrow BdrvChildClass and BlockDriver drained_{begin/end/poll} callbacks from
>   IO_OR_GS_CODE() to GLOBAL_STATE_CODE() [Kevin]
> - Include Kevin's "block: Fix use after free in blockdev_mark_auto_del()" to
>   fix a latent bug that was exposed by this series
> 
> v4:
> - Remove external_disable_cnt variable [Philippe]
> - Add Patch 1 to fix assertion failure in .drained_end() -> blk_get_aio_context()
> v3:
> - Resend full patch series. v2 was sent in the middle of a git rebase and was
>   missing patches. [Eric]
> - Apply Reviewed-by tags.
> v2:
> - Do not rely on BlockBackend request queuing, implement .drained_begin/end()
>   instead in xen-block, virtio-blk, and virtio-scsi [Paolo]
> - Add qdev_is_realized() API [Philippe]
> - Add patch to avoid AioContext lock around blk_exp_ref/unref() [Paolo]
> - Add patch to call .drained_begin/end() from main loop thread to simplify
>   callback implementations
> 
> The aio_disable_external() API temporarily suspends file descriptor monitoring
> in the event loop. The block layer uses this to prevent new I/O requests being
> submitted from the guest and elsewhere between bdrv_drained_begin() and
> bdrv_drained_end().
> 
> While the block layer still needs to prevent new I/O requests in drained
> sections, the aio_disable_external() API can be replaced with
> .drained_begin/end/poll() callbacks that have been added to BdrvChildClass and
> BlockDevOps.
> 
> This newer .bdrained_begin/end/poll() approach is attractive because it works
> without specifying a specific AioContext. The block layer is moving towards
> multi-queue and that means multiple AioContexts may be processing I/O
> simultaneously.
> 
> The aio_disable_external() was always somewhat hacky. It suspends all file
> descriptors that were registered with is_external=true, even if they have
> nothing to do with the BlockDriverState graph nodes that are being drained.
> It's better to solve a block layer problem in the block layer than to have an
> odd event loop API solution.
> 
> The approach in this patch series is to implement BlockDevOps
> .drained_begin/end() callbacks that temporarily stop file descriptor handlers.
> This ensures that new I/O requests are not submitted in drained sections.

Patches 2-16: Reviewed-by: Kevin Wolf <kwolf@redhat.com>