| Message ID | 1376630190-5912-3-git-send-email-jasowang@redhat.com (mailing list archive) |
|---|---|
| State | New, archived |
On Fri, Aug 16, 2013 at 01:16:26PM +0800, Jason Wang wrote:
> Switch to use vhost_add_used_and_signal_n() to avoid multiple calls to
> vhost_add_used_and_signal(). With the patch we will call at most 2 times
> (consider done_idx wrap around) compared to N times w/o this patch.
>
> Signed-off-by: Jason Wang <jasowang@redhat.com>

So? Does this help performance then?

> ---
>  drivers/vhost/net.c | 13 ++++++++-----
>  1 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 280ee66..8a6dd0d 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -281,7 +281,7 @@ static void vhost_zerocopy_signal_used(struct vhost_net *net,
>  {
>  	struct vhost_net_virtqueue *nvq =
>  		container_of(vq, struct vhost_net_virtqueue, vq);
> -	int i;
> +	int i, add;
>  	int j = 0;
>
>  	for (i = nvq->done_idx; i != nvq->upend_idx; i = (i + 1) % UIO_MAXIOV) {
> @@ -289,14 +289,17 @@ static void vhost_zerocopy_signal_used(struct vhost_net *net,
>  			vhost_net_tx_err(net);
>  		if (VHOST_DMA_IS_DONE(vq->heads[i].len)) {
>  			vq->heads[i].len = VHOST_DMA_CLEAR_LEN;
> -			vhost_add_used_and_signal(vq->dev, vq,
> -						  vq->heads[i].id, 0);
>  			++j;
>  		} else
>  			break;
>  	}
> -	if (j)
> -		nvq->done_idx = i;
> +	while (j) {
> +		add = min(UIO_MAXIOV - nvq->done_idx, j);
> +		vhost_add_used_and_signal_n(vq->dev, vq,
> +					    &vq->heads[nvq->done_idx], add);
> +		nvq->done_idx = (nvq->done_idx + add) % UIO_MAXIOV;
> +		j -= add;
> +	}
>  }
>
>  static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success)
> --
> 1.7.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/16/2013 05:54 PM, Michael S. Tsirkin wrote:
> On Fri, Aug 16, 2013 at 01:16:26PM +0800, Jason Wang wrote:
>> Switch to use vhost_add_used_and_signal_n() to avoid multiple calls to
>> vhost_add_used_and_signal(). With the patch we will call at most 2 times
>> (consider done_idx wrap around) compared to N times w/o this patch.
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> So? Does this help performance then?

It looks like it can, especially when the guest supports event index. When the guest enables tx interrupts, this can save us some unnecessary signals to the guest. I will run some tests.
On 08/20/2013 10:33 AM, Jason Wang wrote:
> On 08/16/2013 05:54 PM, Michael S. Tsirkin wrote:
>> On Fri, Aug 16, 2013 at 01:16:26PM +0800, Jason Wang wrote:
>>> Switch to use vhost_add_used_and_signal_n() to avoid multiple calls to
>>> vhost_add_used_and_signal(). With the patch we will call at most 2 times
>>> (consider done_idx wrap around) compared to N times w/o this patch.
>>>
>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> So? Does this help performance then?
>>
> It looks like it can, especially when the guest supports event index. When
> the guest enables tx interrupts, this can save us some unnecessary signals
> to the guest. I will run some tests.

I have run some tests, and I can see a 2% - 3% increase in both aggregate transaction rate and per-cpu transaction rate in TCP_RR and UDP_RR tests.

I'm using ixgbe. W/o this patch, I can see more than 100 calls of vhost_add_used_and_signal() in one vhost_zerocopy_signal_used(). This is because ixgbe (like other modern ethernet drivers) tends to free old tx skbs in a loop during the tx interrupt, while vhost tends to batch the used-ring updates and signaling in vhost_zerocopy_callback(). Switching to vhost_add_used_and_signal_n() means saving about 100 used-index updates and memory barriers.
On Fri, Aug 23, 2013 at 04:50:38PM +0800, Jason Wang wrote:
> On 08/20/2013 10:33 AM, Jason Wang wrote:
>> On 08/16/2013 05:54 PM, Michael S. Tsirkin wrote:
>>> On Fri, Aug 16, 2013 at 01:16:26PM +0800, Jason Wang wrote:
>>>> Switch to use vhost_add_used_and_signal_n() to avoid multiple calls to
>>>> vhost_add_used_and_signal(). With the patch we will call at most 2 times
>>>> (consider done_idx wrap around) compared to N times w/o this patch.
>>>>
>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>> So? Does this help performance then?
>>>
>> It looks like it can, especially when the guest supports event index. When
>> the guest enables tx interrupts, this can save us some unnecessary signals
>> to the guest. I will run some tests.
>
> I have run some tests, and I can see a 2% - 3% increase in both aggregate
> transaction rate and per-cpu transaction rate in TCP_RR and UDP_RR tests.
>
> I'm using ixgbe. W/o this patch, I can see more than 100 calls of
> vhost_add_used_and_signal() in one vhost_zerocopy_signal_used(). This is
> because ixgbe (like other modern ethernet drivers) tends to free old tx
> skbs in a loop during the tx interrupt, while vhost tends to batch the
> used-ring updates and signaling in vhost_zerocopy_callback(). Switching
> to vhost_add_used_and_signal_n() means saving about 100 used-index
> updates and memory barriers.

Well, it's only an smp_wmb(), so a nop on most architectures; a 2% gain is therefore surprising. I'm guessing the cache miss on the write is what's giving us a speedup here. I'll review the code, thanks.
```diff
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 280ee66..8a6dd0d 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -281,7 +281,7 @@ static void vhost_zerocopy_signal_used(struct vhost_net *net,
 {
 	struct vhost_net_virtqueue *nvq =
 		container_of(vq, struct vhost_net_virtqueue, vq);
-	int i;
+	int i, add;
 	int j = 0;
 
 	for (i = nvq->done_idx; i != nvq->upend_idx; i = (i + 1) % UIO_MAXIOV) {
@@ -289,14 +289,17 @@ static void vhost_zerocopy_signal_used(struct vhost_net *net,
 			vhost_net_tx_err(net);
 		if (VHOST_DMA_IS_DONE(vq->heads[i].len)) {
 			vq->heads[i].len = VHOST_DMA_CLEAR_LEN;
-			vhost_add_used_and_signal(vq->dev, vq,
-						  vq->heads[i].id, 0);
 			++j;
 		} else
 			break;
 	}
-	if (j)
-		nvq->done_idx = i;
+	while (j) {
+		add = min(UIO_MAXIOV - nvq->done_idx, j);
+		vhost_add_used_and_signal_n(vq->dev, vq,
+					    &vq->heads[nvq->done_idx], add);
+		nvq->done_idx = (nvq->done_idx + add) % UIO_MAXIOV;
+		j -= add;
+	}
 }
 
 static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success)
```
Switch to use vhost_add_used_and_signal_n() to avoid multiple calls to
vhost_add_used_and_signal(). With the patch we will call at most 2 times
(consider done_idx wrap around) compared to N times w/o this patch.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vhost/net.c | 13 ++++++++-----
 1 files changed, 8 insertions(+), 5 deletions(-)