Message ID | 1470038878-5599-1-git-send-email-jasowang@redhat.com (mailing list archive)
---|---
State | New, archived
On Mon, Aug 01, 2016 at 04:07:58PM +0800, Jason Wang wrote:
> We used to set the vring call fd unconditionally even if the guest
> driver does not use MSI-X for this virtqueue at all. This causes lots
> of unnecessary userspace accesses and other checks for drivers that do
> not use interrupts at all (e.g. the virtio-net pmd). So check and
> clear the vring call fd if the guest does not use any vector for this
> virtqueue at all.
>
> Perf diff (on rx) shows that lots of CPU cycles wasted in
> vhost_signal() were saved:
>
> #
> 28.12%  -27.82%  [vhost]           [k] vhost_signal
> 14.44%   -1.69%  [kernel.vmlinux]  [k] copy_user_generic_string
>  7.05%   +1.53%  [kernel.vmlinux]  [k] __free_page_frag
>  6.51%   +5.53%  [vhost]           [k] vhost_get_vq_desc
> ...
>
> Pktgen tests show a 15.8% improvement on rx pps and 6.5% on tx pps.
>
> Before: RX 2.08Mpps TX 1.35Mpps
> After:  RX 2.41Mpps TX 1.44Mpps
>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  hw/virtio/vhost.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 3d0c807..bd051ab 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -822,6 +822,9 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
>                                   struct vhost_virtqueue *vq,
>                                   unsigned idx)
>  {
> +    BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
> +    VirtioBusState *vbus = VIRTIO_BUS(qbus);
> +    VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
>      hwaddr s, l, a;
>      int r;
>      int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
> @@ -912,8 +915,19 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
>          vhost_virtqueue_mask(dev, vdev, idx, false);
>      }
>
> +    if (k->query_guest_notifiers &&
> +        k->query_guest_notifiers(qbus->parent) &&
> +        virtio_queue_vector(vdev, idx) == VIRTIO_NO_VECTOR) {
> +        file.fd = -1;
> +        r = dev->vhost_ops->vhost_set_vring_call(dev, &file);
> +        if (r) {
> +            goto fail_vector;
> +        }
> +    }
> +

It's rather asymmetrical though. Wouldn't a better place
be in vhost_virtqueue_mask?

>      return 0;
>
> +fail_vector:
>  fail_kick:
>  fail_alloc:
>      cpu_physical_memory_unmap(vq->ring, virtio_queue_get_ring_size(vdev, idx),
> --
> 2.7.4
On Mon, 1 Aug 2016 16:07:58 +0800
Jason Wang <jasowang@redhat.com> wrote:

> We used to set the vring call fd unconditionally even if the guest
> driver does not use MSI-X for this virtqueue at all. This causes lots
> of unnecessary userspace accesses and other checks for drivers that do
> not use interrupts at all (e.g. the virtio-net pmd). So check and
> clear the vring call fd if the guest does not use any vector for this
> virtqueue at all.

So the basic idea is: don't set up signalling via eventfd if the guest
did not enable interrupts for this queue, right?

>
> Perf diff (on rx) shows that lots of CPU cycles wasted in
> vhost_signal() were saved:
>
> #
> 28.12%  -27.82%  [vhost]           [k] vhost_signal
> 14.44%   -1.69%  [kernel.vmlinux]  [k] copy_user_generic_string
>  7.05%   +1.53%  [kernel.vmlinux]  [k] __free_page_frag
>  6.51%   +5.53%  [vhost]           [k] vhost_get_vq_desc
> ...
>
> Pktgen tests show a 15.8% improvement on rx pps and 6.5% on tx pps.
>
> Before: RX 2.08Mpps TX 1.35Mpps
> After:  RX 2.41Mpps TX 1.44Mpps
>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  hw/virtio/vhost.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 3d0c807..bd051ab 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -822,6 +822,9 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
>                                   struct vhost_virtqueue *vq,
>                                   unsigned idx)
>  {
> +    BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
> +    VirtioBusState *vbus = VIRTIO_BUS(qbus);
> +    VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
>      hwaddr s, l, a;
>      int r;
>      int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
> @@ -912,8 +915,19 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
>          vhost_virtqueue_mask(dev, vdev, idx, false);
>      }
>
> +    if (k->query_guest_notifiers &&
> +        k->query_guest_notifiers(qbus->parent) &&
> +        virtio_queue_vector(vdev, idx) == VIRTIO_NO_VECTOR) {

I'm trying to imagine what this means for virtio-ccw. Keep in mind that
we don't have the concept of setting a 'vector' by the OS (the vector
is set up internally to the queue index and the OS does not see it.)

->query_guest_notifiers() is true if the OS has enabled the subchannel
of the proxy device (i.e., if it is enabled for doing *anything* with
the subchannel, regardless of whether the OS wants to be notified or is
planning to poll.) The second condition will never hold true for any
valid queue once the OS has set up the queues.

So this won't break anything for virtio-ccw AFAICS, but I don't think
we gain anything.

> +        file.fd = -1;
> +        r = dev->vhost_ops->vhost_set_vring_call(dev, &file);
> +        if (r) {
> +            goto fail_vector;
> +        }
> +    }
> +
>      return 0;
>
> +fail_vector:
>  fail_kick:
>  fail_alloc:
>      cpu_physical_memory_unmap(vq->ring, virtio_queue_get_ring_size(vdev, idx),
On 2016-08-01 22:08, Michael S. Tsirkin wrote:
> On Mon, Aug 01, 2016 at 04:07:58PM +0800, Jason Wang wrote:
>> We used to set the vring call fd unconditionally even if the guest
>> driver does not use MSI-X for this virtqueue at all. This causes lots
>> of unnecessary userspace accesses and other checks for drivers that do
>> not use interrupts at all (e.g. the virtio-net pmd). So check and
>> clear the vring call fd if the guest does not use any vector for this
>> virtqueue at all.
>>
>> Perf diff (on rx) shows that lots of CPU cycles wasted in
>> vhost_signal() were saved:
>>
>> #
>> 28.12%  -27.82%  [vhost]           [k] vhost_signal
>> 14.44%   -1.69%  [kernel.vmlinux]  [k] copy_user_generic_string
>>  7.05%   +1.53%  [kernel.vmlinux]  [k] __free_page_frag
>>  6.51%   +5.53%  [vhost]           [k] vhost_get_vq_desc
>> ...
>>
>> Pktgen tests show a 15.8% improvement on rx pps and 6.5% on tx pps.
>>
>> Before: RX 2.08Mpps TX 1.35Mpps
>> After:  RX 2.41Mpps TX 1.44Mpps
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>  hw/virtio/vhost.c | 14 ++++++++++++++
>>  1 file changed, 14 insertions(+)
>>
>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> index 3d0c807..bd051ab 100644
>> --- a/hw/virtio/vhost.c
>> +++ b/hw/virtio/vhost.c
>> @@ -822,6 +822,9 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
>>                                   struct vhost_virtqueue *vq,
>>                                   unsigned idx)
>>  {
>> +    BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
>> +    VirtioBusState *vbus = VIRTIO_BUS(qbus);
>> +    VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
>>      hwaddr s, l, a;
>>      int r;
>>      int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
>> @@ -912,8 +915,19 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
>>          vhost_virtqueue_mask(dev, vdev, idx, false);
>>      }
>>
>> +    if (k->query_guest_notifiers &&
>> +        k->query_guest_notifiers(qbus->parent) &&
>> +        virtio_queue_vector(vdev, idx) == VIRTIO_NO_VECTOR) {
>> +        file.fd = -1;
>> +        r = dev->vhost_ops->vhost_set_vring_call(dev, &file);
>> +        if (r) {
>> +            goto fail_vector;
>> +        }
>> +    }
>> +
> It's rather asymmetrical though. Wouldn't a better place
> be in vhost_virtqueue_mask?

Looks not, since it will only be called when:

- the backend does not support interrupt masking (vhost-user); the
  patch should work for vhost-kernel too
- a vector is masked or unmasked, while the patch only works for a
  virtqueue with VIRTIO_NO_VECTOR

>
>
>>      return 0;
>>
>> +fail_vector:
>>  fail_kick:
>>  fail_alloc:
>>      cpu_physical_memory_unmap(vq->ring, virtio_queue_get_ring_size(vdev, idx),
>> --
>> 2.7.4
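To make the control-flow argument above concrete, here is a small stand-alone model of the two candidate hook points. All names are hypothetical stand-ins, not QEMU code; it only sketches, under that assumption, why a check placed solely in the mask/unmask path never runs for a queue whose driver never assigns a vector.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define VIRTIO_NO_VECTOR 0xffff

/* Hypothetical model, not QEMU code: one virtqueue's signalling state. */
struct vq_model {
    uint16_t vector;    /* MSI-X vector assigned by the guest, or VIRTIO_NO_VECTOR */
    bool call_fd_set;   /* does vhost hold an eventfd to signal? */
};

/* Hook A: queue start -- where the patch clears the call fd. */
static void on_vq_start(struct vq_model *vq)
{
    if (vq->vector == VIRTIO_NO_VECTOR) {
        vq->call_fd_set = false;    /* models SET_VRING_CALL with fd = -1 */
    }
}

/* Hook B: vector mask/unmask -- the placement suggested above. It is
 * only invoked when the guest masks or unmasks a vector, which a
 * vector-less polling queue never does. */
static void on_vq_mask(struct vq_model *vq, bool mask)
{
    vq->call_fd_set = !mask;
}

int main(void)
{
    struct vq_model polled = { .vector = VIRTIO_NO_VECTOR, .call_fd_set = true };

    on_vq_start(&polled);   /* hook A fires at queue start and clears the fd */
    /* Hook B is never reached for this queue, so on its own it would
     * leave the signalling cost in place. */
    (void)on_vq_mask;       /* referenced only to show it stays unused here */
    printf("call fd set after start: %d\n", polled.call_fd_set);    /* 0 */
    return 0;
}
```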
On 2016-08-02 02:00, Cornelia Huck wrote:
> On Mon, 1 Aug 2016 16:07:58 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
>> We used to set the vring call fd unconditionally even if the guest
>> driver does not use MSI-X for this virtqueue at all. This causes lots
>> of unnecessary userspace accesses and other checks for drivers that do
>> not use interrupts at all (e.g. the virtio-net pmd). So check and
>> clear the vring call fd if the guest does not use any vector for this
>> virtqueue at all.
> So the basic idea is: don't set up signalling via eventfd if the guest
> did not enable interrupts for this queue, right?

Right.

>> Perf diff (on rx) shows that lots of CPU cycles wasted in
>> vhost_signal() were saved:
>>
>> #
>> 28.12%  -27.82%  [vhost]           [k] vhost_signal
>> 14.44%   -1.69%  [kernel.vmlinux]  [k] copy_user_generic_string
>>  7.05%   +1.53%  [kernel.vmlinux]  [k] __free_page_frag
>>  6.51%   +5.53%  [vhost]           [k] vhost_get_vq_desc
>> ...
>>
>> Pktgen tests show a 15.8% improvement on rx pps and 6.5% on tx pps.
>>
>> Before: RX 2.08Mpps TX 1.35Mpps
>> After:  RX 2.41Mpps TX 1.44Mpps
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>  hw/virtio/vhost.c | 14 ++++++++++++++
>>  1 file changed, 14 insertions(+)
>>
>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> index 3d0c807..bd051ab 100644
>> --- a/hw/virtio/vhost.c
>> +++ b/hw/virtio/vhost.c
>> @@ -822,6 +822,9 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
>>                                   struct vhost_virtqueue *vq,
>>                                   unsigned idx)
>>  {
>> +    BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
>> +    VirtioBusState *vbus = VIRTIO_BUS(qbus);
>> +    VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
>>      hwaddr s, l, a;
>>      int r;
>>      int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
>> @@ -912,8 +915,19 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
>>          vhost_virtqueue_mask(dev, vdev, idx, false);
>>      }
>>
>> +    if (k->query_guest_notifiers &&
>> +        k->query_guest_notifiers(qbus->parent) &&
>> +        virtio_queue_vector(vdev, idx) == VIRTIO_NO_VECTOR) {
> I'm trying to imagine what this means for virtio-ccw. Keep in mind that
> we don't have the concept of setting a 'vector' by the OS (the vector
> is set up internally to the queue index and the OS does not see it.)
>
> ->query_guest_notifiers() is true if the OS has enabled the subchannel
> of the proxy device (i.e., if it is enabled for doing *anything* with
> the subchannel, regardless of whether the OS wants to be notified or is
> planning to poll.) The second condition will never hold true for any
> valid queue once the OS has set up the queues.

I see. So if I understand correctly, there's no way to detect whether
or not the guest will use a specific virtqueue interrupt?

>
> So this won't break anything for virtio-ccw AFAICS, but I don't think
> we gain anything.

Yes, but if we could infer whether or not polling is used in the
driver, this is probably all we can do for ccw.

Thanks

>
>> +        file.fd = -1;
>> +        r = dev->vhost_ops->vhost_set_vring_call(dev, &file);
>> +        if (r) {
>> +            goto fail_vector;
>> +        }
>> +    }
>> +
>>      return 0;
>>
>> +fail_vector:
>>  fail_kick:
>>  fail_alloc:
>>      cpu_physical_memory_unmap(vq->ring, virtio_queue_get_ring_size(vdev, idx),
On Tue, 2 Aug 2016 10:39:22 +0800
Jason Wang <jasowang@redhat.com> wrote:

> On 2016-08-02 02:00, Cornelia Huck wrote:
> > On Mon, 1 Aug 2016 16:07:58 +0800
> > Jason Wang <jasowang@redhat.com> wrote:
> >> +    if (k->query_guest_notifiers &&
> >> +        k->query_guest_notifiers(qbus->parent) &&
> >> +        virtio_queue_vector(vdev, idx) == VIRTIO_NO_VECTOR) {
> > I'm trying to imagine what this means for virtio-ccw. Keep in mind that
> > we don't have the concept of setting a 'vector' by the OS (the vector
> > is set up internally to the queue index and the OS does not see it.)
> >
> > ->query_guest_notifiers() is true if the OS has enabled the subchannel
> > of the proxy device (i.e., if it is enabled for doing *anything* with
> > the subchannel, regardless of whether the OS wants to be notified or is
> > planning to poll.) The second condition will never hold true for any
> > valid queue once the OS has set up the queues.
>
> I see. So if I understand correctly, there's no way to detect whether
> or not the guest will use a specific virtqueue interrupt?

Yes. The guest will either be notified for any virtqueue (if it
registered indicators; this is always done for every vq of the device
at once), or for none.

>
> >
> > So this won't break anything for virtio-ccw AFAICS, but I don't think
> > we gain anything.
>
> Yes, but if we could infer whether or not polling is used in the
> driver, this is probably all we can do for ccw.

What we could do is check whether the driver has registered indicators
and disable setting up notification for any vq of the device if not.
But I'm not sure an always-polling driver is worth optimizing for.
On 2016-08-02 14:37, Cornelia Huck wrote:
> On Tue, 2 Aug 2016 10:39:22 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
>> On 2016-08-02 02:00, Cornelia Huck wrote:
>>> On Mon, 1 Aug 2016 16:07:58 +0800
>>> Jason Wang <jasowang@redhat.com> wrote:
>>>> +    if (k->query_guest_notifiers &&
>>>> +        k->query_guest_notifiers(qbus->parent) &&
>>>> +        virtio_queue_vector(vdev, idx) == VIRTIO_NO_VECTOR) {
>>> I'm trying to imagine what this means for virtio-ccw. Keep in mind that
>>> we don't have the concept of setting a 'vector' by the OS (the vector
>>> is set up internally to the queue index and the OS does not see it.)
>>>
>>> ->query_guest_notifiers() is true if the OS has enabled the subchannel
>>> of the proxy device (i.e., if it is enabled for doing *anything* with
>>> the subchannel, regardless of whether the OS wants to be notified or is
>>> planning to poll.) The second condition will never hold true for any
>>> valid queue once the OS has set up the queues.
>> I see. So if I understand correctly, there's no way to detect whether
>> or not the guest will use a specific virtqueue interrupt?
> Yes. The guest will either be notified for any virtqueue (if it
> registered indicators; this is always done for every vq of the device
> at once), or for none.
>
>>> So this won't break anything for virtio-ccw AFAICS, but I don't think
>>> we gain anything.
>> Yes, but if we could infer whether or not polling is used in the
>> driver, this is probably all we can do for ccw.
> What we could do is check whether the driver has registered indicators
> and disable setting up notification for any vq of the device if not.
> But I'm not sure an always-polling driver is worth optimizing for.

It's worth it at least for the PCI transport. Consider a virtio-net pmd
in the guest: a NULL vring call can save unnecessary userspace memory
accesses and memory barriers.
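The saving Jason describes comes from the host kernel's signalling path. Below is a minimal stand-alone model of that path, paraphrased from my reading of drivers/vhost/vhost.c (simplified and version-dependent; the stubbed vhost_notify_model() stands in for the real reads of guest ring flags and the associated memory barriers):

```c
#include <stdbool.h>
#include <stdio.h>

struct eventfd_ctx;     /* opaque stand-in for the kernel's eventfd context */

/* Model of one vhost virtqueue: call_ctx becomes NULL once userspace
 * issues VHOST_SET_VRING_CALL with fd = -1, as the patch does. */
struct vhost_vq_model {
    struct eventfd_ctx *call_ctx;
};

/* In the real vhost_notify(), this is where vhost reads the avail ring
 * flags / used event index from guest memory, with memory barriers --
 * the per-packet cost the patch avoids for interrupt-less queues. */
static bool vhost_notify_model(struct vhost_vq_model *vq)
{
    (void)vq;
    return true;    /* pretend the guest asked for an interrupt */
}

static void vhost_signal_model(struct vhost_vq_model *vq)
{
    /* With call_ctx cleared, the && short-circuits: no guest-memory
     * access, no barrier, no eventfd write. */
    if (vq->call_ctx && vhost_notify_model(vq)) {
        printf("eventfd_signal(call_ctx, 1)\n");
    }
}

int main(void)
{
    struct vhost_vq_model polled = { .call_ctx = NULL };
    vhost_signal_model(&polled);    /* prints nothing: a cheap no-op */
    return 0;
}
```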
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 3d0c807..bd051ab 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -822,6 +822,9 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
                                  struct vhost_virtqueue *vq,
                                  unsigned idx)
 {
+    BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+    VirtioBusState *vbus = VIRTIO_BUS(qbus);
+    VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
     hwaddr s, l, a;
     int r;
     int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, idx);
@@ -912,8 +915,19 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
         vhost_virtqueue_mask(dev, vdev, idx, false);
     }
 
+    if (k->query_guest_notifiers &&
+        k->query_guest_notifiers(qbus->parent) &&
+        virtio_queue_vector(vdev, idx) == VIRTIO_NO_VECTOR) {
+        file.fd = -1;
+        r = dev->vhost_ops->vhost_set_vring_call(dev, &file);
+        if (r) {
+            goto fail_vector;
+        }
+    }
+
     return 0;
 
+fail_vector:
 fail_kick:
 fail_alloc:
     cpu_physical_memory_unmap(vq->ring, virtio_queue_get_ring_size(vdev, idx),
We used to set the vring call fd unconditionally even if the guest
driver does not use MSI-X for this virtqueue at all. This causes lots
of unnecessary userspace accesses and other checks for drivers that do
not use interrupts at all (e.g. the virtio-net pmd). So check and
clear the vring call fd if the guest does not use any vector for this
virtqueue at all.

Perf diff (on rx) shows that lots of CPU cycles wasted in
vhost_signal() were saved:

#
28.12%  -27.82%  [vhost]           [k] vhost_signal
14.44%   -1.69%  [kernel.vmlinux]  [k] copy_user_generic_string
 7.05%   +1.53%  [kernel.vmlinux]  [k] __free_page_frag
 6.51%   +5.53%  [vhost]           [k] vhost_get_vq_desc
...

Pktgen tests show a 15.8% improvement on rx pps and 6.5% on tx pps.

Before: RX 2.08Mpps TX 1.35Mpps
After:  RX 2.41Mpps TX 1.44Mpps

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/vhost.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
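To tie the thread together, here is a stand-alone model of the condition the patch adds at queue start (the helper name is hypothetical, not QEMU code): clear the call fd only when the transport can report guest-notifier state, the guest has notifiers enabled, and no vector is assigned to the queue. Per the discussion above, this fires for a polling PCI driver such as a virtio-net pmd, and never for virtio-ccw once its queues are set up.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define VIRTIO_NO_VECTOR 0xffff

/* Hypothetical stand-alone model of the patch's condition. */
static bool should_clear_call_fd(bool bus_can_query_notifiers,
                                 bool guest_notifiers_enabled,
                                 uint16_t queue_vector)
{
    return bus_can_query_notifiers &&
           guest_notifiers_enabled &&
           queue_vector == VIRTIO_NO_VECTOR;
}

int main(void)
{
    /* Polling PCI guest driver: notifiers enabled for the device, but
     * no MSI-X vector assigned to this queue -> clear the call fd. */
    printf("%d\n", should_clear_call_fd(true, true, VIRTIO_NO_VECTOR)); /* 1 */

    /* Ordinary interrupt-driven driver with vector 3 -> keep the eventfd. */
    printf("%d\n", should_clear_call_fd(true, true, 3));                /* 0 */
    return 0;
}
```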