diff mbox

[6/6] vhost_net: remove the max pending check

Message ID 1376630190-5912-7-git-send-email-jasowang@redhat.com (mailing list archive)
State New, archived

Commit Message

Jason Wang Aug. 16, 2013, 5:16 a.m. UTC
We used to limit the max pending DMAs to prevent the guest from pinning too many
pages. But this limit can be removed because:

- We have the sk_wmem_alloc check in both tun and macvtap to do the same work.
- This max pending check was almost useless since it was only done when there
  were no new buffers coming from the guest. The guest can easily exceed the limit.
- We already check upend_idx != done_idx and switch to non-zerocopy then. So
  even if all vq->heads are used, we can still do the packet transmission.

So remove this check completely.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vhost/net.c |   13 -------------
 1 files changed, 0 insertions(+), 13 deletions(-)
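
For readers following the thread, the bookkeeping that this patch removes can be sketched in isolation. The function below mirrors the wrap-around arithmetic from the deleted hunk of handle_tx(). The standalone helper names and the plain int parameters are illustrative assumptions, not vhost code, and UIO_MAXIOV is taken to be 1024 as in the kernel's uio.h.

```c
#include <assert.h>

#define UIO_MAXIOV     1024  /* ring size, as in the kernel's uio.h */
#define VHOST_MAX_PEND 128   /* limit removed by this patch */

/* Hypothetical helper: number of zerocopy buffers submitted (upend_idx)
 * but not yet completed (done_idx), handling upend_idx wrap-around. */
static int num_pending(int upend_idx, int done_idx)
{
	assert(upend_idx >= 0 && upend_idx < UIO_MAXIOV);
	assert(done_idx >= 0 && done_idx < UIO_MAXIOV);

	return upend_idx >= done_idx ?
	       upend_idx - done_idx :
	       upend_idx + UIO_MAXIOV - done_idx;
}

/* Hypothetical helper: the condition under which the old code broke
 * out of the TX loop instead of re-enabling guest notifications. */
static int over_pend_limit(int upend_idx, int done_idx)
{
	return num_pending(upend_idx, done_idx) > VHOST_MAX_PEND;
}
```

In the original code this condition was evaluated only inside the "head == vq->num" branch, that is, only when the guest had queued no new buffers.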

Comments

Michael S. Tsirkin Aug. 16, 2013, 10:02 a.m. UTC | #1
On Fri, Aug 16, 2013 at 01:16:30PM +0800, Jason Wang wrote:
> We used to limit the max pending DMAs to prevent guest from pinning too many
> pages. But this could be removed since:
> 
> - We have the sk_wmem_alloc check in both tun/macvtap to do the same work
> - This max pending check were almost useless since it was one done when there's
>   no new buffers coming from guest. Guest can easily exceeds the limitation.
> - We've already check upend_idx != done_idx and switch to non zerocopy then. So
>   even if all vq->heads were used, we can still does the packet transmission.

We can, but performance will suffer.

> 
> So remove this check completely.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/vhost/net.c |   13 -------------
>  1 files changed, 0 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index a035a89..ed3f165 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -38,8 +38,6 @@ MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;"
>   * Using this limit prevents one virtqueue from starving others. */
>  #define VHOST_NET_WEIGHT 0x80000
>  
> -/* MAX number of TX used buffers for outstanding zerocopy */
> -#define VHOST_MAX_PEND 128
>  #define VHOST_GOODCOPY_LEN 256
>  
>  /*
> @@ -372,17 +370,6 @@ static void handle_tx(struct vhost_net *net)
>  			break;
>  		/* Nothing new?  Wait for eventfd to tell us they refilled. */
>  		if (head == vq->num) {
> -			int num_pends;
> -
> -			/* If more outstanding DMAs, queue the work.
> -			 * Handle upend_idx wrap around
> -			 */
> -			num_pends = likely(nvq->upend_idx >= nvq->done_idx) ?
> -				    (nvq->upend_idx - nvq->done_idx) :
> -				    (nvq->upend_idx + UIO_MAXIOV -
> -				     nvq->done_idx);
> -			if (unlikely(num_pends > VHOST_MAX_PEND))
> -				break;
>  			if (unlikely(vhost_enable_notify(&net->dev, vq))) {
>  				vhost_disable_notify(&net->dev, vq);
>  				continue;
> -- 
> 1.7.1
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Wang Aug. 20, 2013, 2:48 a.m. UTC | #2
On 08/16/2013 06:02 PM, Michael S. Tsirkin wrote:
> On Fri, Aug 16, 2013 at 01:16:30PM +0800, Jason Wang wrote:
>> We used to limit the max pending DMAs to prevent guest from pinning too many
>> pages. But this could be removed since:
>>
>> - We have the sk_wmem_alloc check in both tun/macvtap to do the same work
>> - This max pending check were almost useless since it was one done when there's
>>   no new buffers coming from guest. Guest can easily exceeds the limitation.
>> - We've already check upend_idx != done_idx and switch to non zerocopy then. So
>>   even if all vq->heads were used, we can still does the packet transmission.
> We can but performance will suffer.

The check was in fact only done when no new buffers were submitted by the
guest. So if the guest keeps sending, the check won't be done.

If we really want to do this, we should do it unconditionally. Anyway, I
will run tests to see the result.
>
>> So remove this check completely.
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>  drivers/vhost/net.c |   13 -------------
>>  1 files changed, 0 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> index a035a89..ed3f165 100644
>> --- a/drivers/vhost/net.c
>> +++ b/drivers/vhost/net.c
>> @@ -38,8 +38,6 @@ MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;"
>>   * Using this limit prevents one virtqueue from starving others. */
>>  #define VHOST_NET_WEIGHT 0x80000
>>  
>> -/* MAX number of TX used buffers for outstanding zerocopy */
>> -#define VHOST_MAX_PEND 128
>>  #define VHOST_GOODCOPY_LEN 256
>>  
>>  /*
>> @@ -372,17 +370,6 @@ static void handle_tx(struct vhost_net *net)
>>  			break;
>>  		/* Nothing new?  Wait for eventfd to tell us they refilled. */
>>  		if (head == vq->num) {
>> -			int num_pends;
>> -
>> -			/* If more outstanding DMAs, queue the work.
>> -			 * Handle upend_idx wrap around
>> -			 */
>> -			num_pends = likely(nvq->upend_idx >= nvq->done_idx) ?
>> -				    (nvq->upend_idx - nvq->done_idx) :
>> -				    (nvq->upend_idx + UIO_MAXIOV -
>> -				     nvq->done_idx);
>> -			if (unlikely(num_pends > VHOST_MAX_PEND))
>> -				break;
>>  			if (unlikely(vhost_enable_notify(&net->dev, vq))) {
>>  				vhost_disable_notify(&net->dev, vq);
>>  				continue;
>> -- 
>> 1.7.1
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

Jason Wang Aug. 23, 2013, 8:55 a.m. UTC | #3
On 08/20/2013 10:48 AM, Jason Wang wrote:
> On 08/16/2013 06:02 PM, Michael S. Tsirkin wrote:
>> > On Fri, Aug 16, 2013 at 01:16:30PM +0800, Jason Wang wrote:
>>> >> We used to limit the max pending DMAs to prevent guest from pinning too many
>>> >> pages. But this could be removed since:
>>> >>
>>> >> - We have the sk_wmem_alloc check in both tun/macvtap to do the same work
>>> >> - This max pending check were almost useless since it was one done when there's
>>> >>   no new buffers coming from guest. Guest can easily exceeds the limitation.
>>> >> - We've already check upend_idx != done_idx and switch to non zerocopy then. So
>>> >>   even if all vq->heads were used, we can still does the packet transmission.
>> > We can but performance will suffer.
> The check were in fact only done when no new buffers submitted from
> guest. So if guest keep sending, the check won't be done.
>
> If we really want to do this, we should do it unconditionally. Anyway, I
> will do test to see the result.

There's a bug in PATCH 5/6. The check:

nvq->upend_idx != nvq->done_idx

always disables zerocopy, since we initialize both upend_idx and
done_idx to zero. So I changed it to:

(nvq->upend_idx + 1) % UIO_MAXIOV != nvq->done_idx

With this change on top, I didn't see any performance difference with and
without this patch.
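
To make the difference between the two checks concrete, here is a small standalone sketch. The helper names are hypothetical and are not part of vhost. Only the two expressions come from the message above, and UIO_MAXIOV is assumed to be 1024 as in the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

#define UIO_MAXIOV 1024

/* Check as written in PATCH 5/6: false whenever the two indices are
 * equal, which includes the initial state where both are zero. */
static bool zcopy_ok_original(int upend_idx, int done_idx)
{
	return upend_idx != done_idx;
}

/* Adjusted check: false only when advancing upend_idx by one slot would
 * make it catch up with done_idx, i.e. when the ring has no free slot. */
static bool zcopy_ok_adjusted(int upend_idx, int done_idx)
{
	assert(upend_idx >= 0 && upend_idx < UIO_MAXIOV);
	assert(done_idx >= 0 && done_idx < UIO_MAXIOV);

	return (upend_idx + 1) % UIO_MAXIOV != done_idx;
}
```

With both indices at zero the original expression is false, so zerocopy is never selected. The adjusted expression is true in that state and becomes false only when upend_idx sits one slot behind done_idx, including across the wrap from UIO_MAXIOV - 1 to 0.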
Michael S. Tsirkin Aug. 25, 2013, 11:53 a.m. UTC | #4
On Fri, Aug 23, 2013 at 04:55:49PM +0800, Jason Wang wrote:
> On 08/20/2013 10:48 AM, Jason Wang wrote:
> > On 08/16/2013 06:02 PM, Michael S. Tsirkin wrote:
> >> > On Fri, Aug 16, 2013 at 01:16:30PM +0800, Jason Wang wrote:
> >>> >> We used to limit the max pending DMAs to prevent guest from pinning too many
> >>> >> pages. But this could be removed since:
> >>> >>
> >>> >> - We have the sk_wmem_alloc check in both tun/macvtap to do the same work
> >>> >> - This max pending check were almost useless since it was one done when there's
> >>> >>   no new buffers coming from guest. Guest can easily exceeds the limitation.
> >>> >> - We've already check upend_idx != done_idx and switch to non zerocopy then. So
> >>> >>   even if all vq->heads were used, we can still does the packet transmission.
> >> > We can but performance will suffer.
> > The check were in fact only done when no new buffers submitted from
> > guest. So if guest keep sending, the check won't be done.
> >
> > If we really want to do this, we should do it unconditionally. Anyway, I
> > will do test to see the result.
> 
> There's a bug in PATCH 5/6, the check:
> 
> nvq->upend_idx != nvq->done_idx
> 
> makes the zerocopy always been disabled since we initialize both
> upend_idx and done_idx to zero. So I change it to:
> 
> (nvq->upend_idx + 1) % UIO_MAXIOV != nvq->done_idx.

But what I would really like to try is to limit ubuf_info to VHOST_MAX_PEND.
I think this has a chance to improve performance since
we'll be using less cache.
Of course this means we must fix the code to really never submit
more than VHOST_MAX_PEND requests.

Want to try?
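
One possible shape for such a hard bound, as a rough sketch only: the structure and function names below are made up for illustration and do not correspond to the existing vhost code. The ring is sized to VHOST_MAX_PEND and a submission is refused when it is full, so the caller would have to fall back to copying.

```c
#include <assert.h>
#include <stdbool.h>

#define VHOST_MAX_PEND 128

/* Hypothetical stand-in for the per-virtqueue zerocopy bookkeeping. */
struct pend_ring {
	int upend_idx;               /* next slot to submit */
	int done_idx;                /* oldest slot not yet completed */
	int slots[VHOST_MAX_PEND];   /* placeholder for ubuf_info entries */
};

/* Try to reserve a slot. Returns false when VHOST_MAX_PEND - 1 requests
 * are already outstanding (one slot stays free to tell full from empty). */
static bool pend_ring_submit(struct pend_ring *r, int head)
{
	int next = (r->upend_idx + 1) % VHOST_MAX_PEND;

	if (next == r->done_idx)
		return false;   /* full: caller falls back to copy */
	r->slots[r->upend_idx] = head;
	r->upend_idx = next;
	return true;
}

/* Mark the oldest outstanding request as completed. */
static void pend_ring_complete(struct pend_ring *r)
{
	assert(r->done_idx != r->upend_idx);   /* must not be empty */
	r->done_idx = (r->done_idx + 1) % VHOST_MAX_PEND;
}
```

Because the bound is checked on every submission, it holds even when the guest keeps the ring busy, unlike a check that runs only when no new buffers are available.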
> 
> With this change on top, I didn't see performance difference w/ and w/o
> this patch.

Did you try small message sizes btw (like 1K)? Or just netperf
default of 64K?
Jason Wang Aug. 26, 2013, 7 a.m. UTC | #5
On 08/25/2013 07:53 PM, Michael S. Tsirkin wrote:
> On Fri, Aug 23, 2013 at 04:55:49PM +0800, Jason Wang wrote:
>> On 08/20/2013 10:48 AM, Jason Wang wrote:
>>> On 08/16/2013 06:02 PM, Michael S. Tsirkin wrote:
>>>>> On Fri, Aug 16, 2013 at 01:16:30PM +0800, Jason Wang wrote:
>>>>>>> We used to limit the max pending DMAs to prevent guest from pinning too many
>>>>>>> pages. But this could be removed since:
>>>>>>>
>>>>>>> - We have the sk_wmem_alloc check in both tun/macvtap to do the same work
>>>>>>> - This max pending check were almost useless since it was one done when there's
>>>>>>>   no new buffers coming from guest. Guest can easily exceeds the limitation.
>>>>>>> - We've already check upend_idx != done_idx and switch to non zerocopy then. So
>>>>>>>   even if all vq->heads were used, we can still does the packet transmission.
>>>>> We can but performance will suffer.
>>> The check were in fact only done when no new buffers submitted from
>>> guest. So if guest keep sending, the check won't be done.
>>>
>>> If we really want to do this, we should do it unconditionally. Anyway, I
>>> will do test to see the result.
>> There's a bug in PATCH 5/6, the check:
>>
>> nvq->upend_idx != nvq->done_idx
>>
>> makes the zerocopy always been disabled since we initialize both
>> upend_idx and done_idx to zero. So I change it to:
>>
>> (nvq->upend_idx + 1) % UIO_MAXIOV != nvq->done_idx.
> But what I would really like to try is limit ubuf_info to VHOST_MAX_PEND.
> I think this has a chance to improve performance since
> we'll be using less cache.

Maybe, but it in fact decreases the vq size to VHOST_MAX_PEND.
> Of course this means we must fix the code to really never submit
> more than VHOST_MAX_PEND requests.
>
> Want to try?

Ok, sure.
>> With this change on top, I didn't see performance difference w/ and w/o
>> this patch.
> Did you try small message sizes btw (like 1K)? Or just netperf
> default of 64K?
>

I only tested multiple sessions of TCP_RR. Will test TCP_STREAM also.
Jason Wang Aug. 30, 2013, 3:23 a.m. UTC | #6
On 08/25/2013 07:53 PM, Michael S. Tsirkin wrote:
> On Fri, Aug 23, 2013 at 04:55:49PM +0800, Jason Wang wrote:
>> On 08/20/2013 10:48 AM, Jason Wang wrote:
>>> On 08/16/2013 06:02 PM, Michael S. Tsirkin wrote:
>>>>> On Fri, Aug 16, 2013 at 01:16:30PM +0800, Jason Wang wrote:
>>>>>>> We used to limit the max pending DMAs to prevent guest from pinning too many
>>>>>>> pages. But this could be removed since:
>>>>>>>
>>>>>>> - We have the sk_wmem_alloc check in both tun/macvtap to do the same work
>>>>>>> - This max pending check were almost useless since it was one done when there's
>>>>>>>   no new buffers coming from guest. Guest can easily exceeds the limitation.
>>>>>>> - We've already check upend_idx != done_idx and switch to non zerocopy then. So
>>>>>>>   even if all vq->heads were used, we can still does the packet transmission.
>>>>> We can but performance will suffer.
>>> The check were in fact only done when no new buffers submitted from
>>> guest. So if guest keep sending, the check won't be done.
>>>
>>> If we really want to do this, we should do it unconditionally. Anyway, I
>>> will do test to see the result.
>> There's a bug in PATCH 5/6, the check:
>>
>> nvq->upend_idx != nvq->done_idx
>>
>> makes the zerocopy always been disabled since we initialize both
>> upend_idx and done_idx to zero. So I change it to:
>>
>> (nvq->upend_idx + 1) % UIO_MAXIOV != nvq->done_idx.
> But what I would really like to try is limit ubuf_info to VHOST_MAX_PEND.
> I think this has a chance to improve performance since
> we'll be using less cache.
> Of course this means we must fix the code to really never submit
> more than VHOST_MAX_PEND requests.
>
> Want to try?

The result: I see about a 5%-10% improvement in per-CPU throughput on
guest tx, but about a 5% degradation in per-CPU transaction rate on TCP_RR.
>> With this change on top, I didn't see performance difference w/ and w/o
>> this patch.
> Did you try small message sizes btw (like 1K)? Or just netperf
> default of 64K?
>

A 5%-10% improvement in per-CPU throughput on guest rx, but some
regressions (5%) on guest tx. So we'd better keep it and make it work
properly.

Will post V2 for your review.
Patch

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index a035a89..ed3f165 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -38,8 +38,6 @@  MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;"
  * Using this limit prevents one virtqueue from starving others. */
 #define VHOST_NET_WEIGHT 0x80000
 
-/* MAX number of TX used buffers for outstanding zerocopy */
-#define VHOST_MAX_PEND 128
 #define VHOST_GOODCOPY_LEN 256
 
 /*
@@ -372,17 +370,6 @@  static void handle_tx(struct vhost_net *net)
 			break;
 		/* Nothing new?  Wait for eventfd to tell us they refilled. */
 		if (head == vq->num) {
-			int num_pends;
-
-			/* If more outstanding DMAs, queue the work.
-			 * Handle upend_idx wrap around
-			 */
-			num_pends = likely(nvq->upend_idx >= nvq->done_idx) ?
-				    (nvq->upend_idx - nvq->done_idx) :
-				    (nvq->upend_idx + UIO_MAXIOV -
-				     nvq->done_idx);
-			if (unlikely(num_pends > VHOST_MAX_PEND))
-				break;
 			if (unlikely(vhost_enable_notify(&net->dev, vq))) {
 				vhost_disable_notify(&net->dev, vq);
 				continue;