Message ID | 20200527102925.128013-2-stefanha@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | virtio-pci: enable blk and scsi multi-queue by default |
On Wed, 27 May 2020 11:29:21 +0100
Stefan Hajnoczi <stefanha@redhat.com> wrote:

> Multi-queue devices achieve the best performance when each vCPU has a
> dedicated queue. This ensures that virtqueue used notifications are
> handled on the same vCPU that submitted virtqueue buffers. When another
> vCPU handles the notification an IPI will be necessary to wake the
> submission vCPU and this incurs a performance overhead.
>
> Provide a helper function that virtio-pci devices will use in later
> patches to automatically select the optimal number of queues.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  hw/virtio/virtio-pci.h | 9 +++++++++
>  hw/virtio/virtio-pci.c | 7 +++++++
>  2 files changed, 16 insertions(+)

That looks like a good idea, since the policy can be easily tweaked in
one place later.

For ccw, I don't see a good way to arrive at an optimal number of
queues. Is there something we should do for mmio? If yes, should this
be a callback in VirtioBusClass?
On Thu, May 28, 2020 at 05:35:55PM +0200, Cornelia Huck wrote:
> On Wed, 27 May 2020 11:29:21 +0100
> Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> > Multi-queue devices achieve the best performance when each vCPU has a
> > dedicated queue. This ensures that virtqueue used notifications are
> > handled on the same vCPU that submitted virtqueue buffers. When another
> > vCPU handles the notification an IPI will be necessary to wake the
> > submission vCPU and this incurs a performance overhead.
> >
> > Provide a helper function that virtio-pci devices will use in later
> > patches to automatically select the optimal number of queues.
> >
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  hw/virtio/virtio-pci.h | 9 +++++++++
> >  hw/virtio/virtio-pci.c | 7 +++++++
> >  2 files changed, 16 insertions(+)
>
> That looks like a good idea, since the policy can be easily tweaked in
> one place later.
>
> For ccw, I don't see a good way to arrive at an optimal number of
> queues. Is there something we should do for mmio? If yes, should this
> be a callback in VirtioBusClass?

Stefan, do you plan to address this?
On Thu, May 28, 2020 at 05:35:55PM +0200, Cornelia Huck wrote:
> On Wed, 27 May 2020 11:29:21 +0100
> Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> > Multi-queue devices achieve the best performance when each vCPU has a
> > dedicated queue. This ensures that virtqueue used notifications are
> > handled on the same vCPU that submitted virtqueue buffers. When another
> > vCPU handles the notification an IPI will be necessary to wake the
> > submission vCPU and this incurs a performance overhead.
> >
> > Provide a helper function that virtio-pci devices will use in later
> > patches to automatically select the optimal number of queues.
> >
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  hw/virtio/virtio-pci.h | 9 +++++++++
> >  hw/virtio/virtio-pci.c | 7 +++++++
> >  2 files changed, 16 insertions(+)
>
> That looks like a good idea, since the policy can be easily tweaked in
> one place later.
>
> For ccw, I don't see a good way to arrive at an optimal number of
> queues. Is there something we should do for mmio? If yes, should this
> be a callback in VirtioBusClass?

I looked at this but virtio-pci devices need to do num_queues ->
num_vectors -> .realize() in that order. It's hard to introduce a
meaningful VirtioBusClass method. (The problem is that some devices
automatically calculate the number of PCI MSI-X vectors based on the
number of queues, but that needs to happen before .realize() and
involves PCI-specific qdev properties.)

Trying to go through a common interface for all transports doesn't
simplify things here.

Stefan
On Mon, 6 Jul 2020 14:25:20 +0100
Stefan Hajnoczi <stefanha@redhat.com> wrote:

> On Thu, May 28, 2020 at 05:35:55PM +0200, Cornelia Huck wrote:
> > On Wed, 27 May 2020 11:29:21 +0100
> > Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > > Multi-queue devices achieve the best performance when each vCPU has a
> > > dedicated queue. This ensures that virtqueue used notifications are
> > > handled on the same vCPU that submitted virtqueue buffers. When another
> > > vCPU handles the notification an IPI will be necessary to wake the
> > > submission vCPU and this incurs a performance overhead.
> > >
> > > Provide a helper function that virtio-pci devices will use in later
> > > patches to automatically select the optimal number of queues.
> > >
> > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > ---
> > >  hw/virtio/virtio-pci.h | 9 +++++++++
> > >  hw/virtio/virtio-pci.c | 7 +++++++
> > >  2 files changed, 16 insertions(+)
> >
> > That looks like a good idea, since the policy can be easily tweaked in
> > one place later.
> >
> > For ccw, I don't see a good way to arrive at an optimal number of
> > queues. Is there something we should do for mmio? If yes, should this
> > be a callback in VirtioBusClass?
>
> I looked at this but virtio-pci devices need to do num_queues ->
> num_vectors -> .realize() in that order. It's hard to introduce a
> meaningful VirtioBusClass method. (The problem is that some devices
> automatically calculate the number of PCI MSI-X vectors based on the
> number of queues, but that needs to happen before .realize() and
> involves PCI-specific qdev properties.)
>
> Trying to go through a common interface for all transports doesn't
> simplify things here.

That makes sense, thanks for checking.
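To make the ordering Stefan describes concrete, here is a rough sketch of how a virtio-pci device's proxy realize function might apply it. The `ExampleDev` type, its `conf.num_queues` field, the "auto" sentinel, and the `+ 1` vector choice are hypothetical placeholders, not code from this series; `virtio_pci_optimal_num_queues()`, `nvectors`, `DEV_NVECTORS_UNSPECIFIED`, and `qdev_realize()` are the existing pieces it would build on.

```c
/*
 * Sketch only (hypothetical ExampleDev device): the queue count must be
 * settled first, then the PCI-only MSI-X vector count derived from it,
 * and only then can the virtio device itself be realized.
 */
static void example_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
{
    ExampleDev *dev = EXAMPLE_PCI_DEV(vpci_dev);

    /* 1. num_queues: decided first, using the PCI-specific policy helper */
    if (dev->conf.num_queues == 0 /* hypothetical "auto" sentinel */) {
        dev->conf.num_queues = virtio_pci_optimal_num_queues(0);
    }

    /* 2. nvectors: a PCI-only qdev property derived from num_queues */
    if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) {
        vpci_dev->nvectors = dev->conf.num_queues + 1; /* +1 config vector */
    }

    /* 3. Only now realize the virtio device */
    qdev_realize(DEVICE(&dev->vdev), BUS(&vpci_dev->bus), errp);
}
```

A transport-neutral VirtioBusClass hook would have to run between steps 1 and 2, yet step 2 touches `nvectors`, which only exists on the PCI proxy; that is why the helper stays PCI-specific.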
```diff
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index e2eaaa9182..91096f0291 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -243,4 +243,13 @@ typedef struct VirtioPCIDeviceTypeInfo {
 /* Register virtio-pci type(s). @t must be static. */
 void virtio_pci_types_register(const VirtioPCIDeviceTypeInfo *t);
 
+/**
+ * virtio_pci_optimal_num_queues:
+ * @fixed_queues: number of queues that are always present
+ *
+ * Returns: The optimal number of queues for a multi-queue device, excluding
+ * @fixed_queues.
+ */
+unsigned virtio_pci_optimal_num_queues(unsigned fixed_queues);
+
 #endif
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index d028c17c24..0c4f0100ca 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -19,6 +19,7 @@
 
 #include "exec/memop.h"
 #include "standard-headers/linux/virtio_pci.h"
+#include "hw/boards.h"
 #include "hw/virtio/virtio.h"
 #include "migration/qemu-file-types.h"
 #include "hw/pci/pci.h"
@@ -2024,6 +2025,12 @@ void virtio_pci_types_register(const VirtioPCIDeviceTypeInfo *t)
     g_free(base_name);
 }
 
+unsigned virtio_pci_optimal_num_queues(unsigned fixed_queues)
+{
+    /* 1:1 vq to vcpu mapping is ideal because it avoids IPIs */
+    return MIN(current_machine->smp.cpus, VIRTIO_QUEUE_MAX - fixed_queues);
+}
+
 /* virtio-pci-bus */
 
 static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
```
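For a quick feel for the policy the helper implements, here is a small standalone snippet, not QEMU code: the `MIN` and `VIRTIO_QUEUE_MAX` definitions are local stand-ins and the vCPU counts are made-up examples.

```c
/* Standalone illustration of the policy in virtio_pci_optimal_num_queues():
 * cap the per-device queue count at the number of vCPUs, while leaving room
 * for any fixed (non per-vCPU) queues within the overall virtqueue limit. */
#include <stdio.h>

#define VIRTIO_QUEUE_MAX 1024            /* QEMU's per-device virtqueue limit */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

static unsigned optimal_num_queues(unsigned vcpus, unsigned fixed_queues)
{
    return MIN(vcpus, VIRTIO_QUEUE_MAX - fixed_queues);
}

int main(void)
{
    /* e.g. virtio-scsi keeps 2 fixed queues (control + event) */
    printf("8 vCPUs, 2 fixed -> %u request queues\n", optimal_num_queues(8, 2));
    printf("4 vCPUs, 0 fixed -> %u request queues\n", optimal_num_queues(4, 0));
    return 0;
}
```

The `fixed_queues` argument exists so that queues which are not per-vCPU request queues still count against VIRTIO_QUEUE_MAX, keeping the total within the limit on very large guests.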
Multi-queue devices achieve the best performance when each vCPU has a
dedicated queue. This ensures that virtqueue used notifications are
handled on the same vCPU that submitted virtqueue buffers. When another
vCPU handles the notification an IPI will be necessary to wake the
submission vCPU and this incurs a performance overhead.

Provide a helper function that virtio-pci devices will use in later
patches to automatically select the optimal number of queues.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/virtio/virtio-pci.h | 9 +++++++++
 hw/virtio/virtio-pci.c | 7 +++++++
 2 files changed, 16 insertions(+)