vhost_net: stop polling socket during rx processing

Message ID 20160427141317-mutt-send-email-mst@redhat.com (mailing list archive)
State New, archived

Commit Message

Michael S. Tsirkin April 27, 2016, 11:28 a.m. UTC
On Tue, Apr 26, 2016 at 03:35:53AM -0400, Jason Wang wrote:
> We don't stop polling the socket during rx processing, which leads to
> unnecessary wakeups from underlying net devices (e.g.
> sock_def_readable() from tun), slowing down rx. This patch avoids this
> by stopping socket polling during rx processing. A small drawback is
> some overhead in the light load case because of the extra start/stop
> polling, but a single netperf TCP_RR run does not notice any change.
> In a super heavy load case, e.g. using pktgen to inject packets into
> the guest, we get about a 9.5% improvement in pps:
> 
> before: ~1370000 pkt/s
> after:  ~1500000 pkt/s
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>
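
In concrete terms, the patch brackets the rx work with a poll
stop/start pair. A minimal sketch of the resulting shape of handle_rx()
(assuming the vhost_net helpers vhost_net_disable_vq() and
vhost_net_enable_vq(); the receive loop itself is elided):

static void handle_rx(struct vhost_net *net)
{
	struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_RX];
	struct vhost_virtqueue *vq = &nvq->vq;
	struct socket *sock;

	mutex_lock(&vq->mutex);
	sock = vq->private_data;
	if (!sock)
		goto out;

	/* Stop socket polling: no more wakeups from the underlying
	 * net device (e.g. sock_def_readable() from tun) while we
	 * drain the socket. */
	vhost_net_disable_vq(net, vq);

	/* ... receive loop: move packets from sock into the guest's
	 * rx ring until one of the two is exhausted ... */

	/* Resume polling so future packets wake the vhost thread. */
	vhost_net_enable_vq(net, vq);
out:
	mutex_unlock(&vq->mutex);
}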

There is one other possible enhancement: we actually have the wait
queue lock taken in __wake_up, but we give it up only to take it again
in the handler.

It would be nicer to just remove the entry when we wake the vhost
thread, and re-add it if required. I think that something like the
patch below would give you the necessary API. Please feel free to use
it if you implement a follow-up patch doing this - that's not a reason
to hold back this simple patch, though.

--->

wait: add API to drop a wait_queue_t entry from wake up handler

A wake up handler might want to remove its own wait queue entry to avoid
future wakeups.  In particular, vhost has such a need.  As the wait
queue lock is already taken in the wakeup path, all we need is an API to
remove the entry without the wait_queue_head_t, which isn't currently
accessible to wake up handlers.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

---


Comments

Jason Wang April 28, 2016, 6:19 a.m. UTC | #1
On 04/27/2016 07:28 PM, Michael S. Tsirkin wrote:
> On Tue, Apr 26, 2016 at 03:35:53AM -0400, Jason Wang wrote:
>> We don't stop polling the socket during rx processing, which leads to
>> unnecessary wakeups from underlying net devices (e.g.
>> sock_def_readable() from tun), slowing down rx. This patch avoids this
>> by stopping socket polling during rx processing. A small drawback is
>> some overhead in the light load case because of the extra start/stop
>> polling, but a single netperf TCP_RR run does not notice any change.
>> In a super heavy load case, e.g. using pktgen to inject packets into
>> the guest, we get about a 9.5% improvement in pps:
>>
>> before: ~1370000 pkt/s
>> after:  ~1500000 pkt/s
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>
> There is one other possible enhancement: we actually have the wait
> queue lock taken in __wake_up, but we give it up only to take it again
> in the handler.
>
> It would be nicer to just remove the entry when we wake the vhost
> thread, and re-add it if required. I think that something like the
> patch below would give you the necessary API. Please feel free to use
> it if you implement a follow-up patch doing this - that's not a reason
> to hold back this simple patch, though.

Thanks, this looks useful, will give it a try.

>
> --->
>
> wait: add API to drop a wait_queue_t entry from wake up handler
>
> A wake up handler might want to remove its own wait queue entry to avoid
> future wakeups.  In particular, vhost has such a need.  As the wait
> queue lock is already taken in the wakeup path, all we need is an API to
> remove the entry without the wait_queue_head_t, which isn't currently
> accessible to wake up handlers.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>
> ---
>
> diff --git a/include/linux/wait.h b/include/linux/wait.h
> index 27d7a0a..9c6604b 100644
> --- a/include/linux/wait.h
> +++ b/include/linux/wait.h
> @@ -191,11 +191,17 @@ __add_wait_queue_tail_exclusive(wait_queue_head_t *q, wait_queue_t *wait)
>  }
>  
>  static inline void
> -__remove_wait_queue(wait_queue_head_t *head, wait_queue_t *old)
> +__remove_wait_queue_entry(wait_queue_t *old)
>  {
>  	list_del(&old->task_list);
>  }
>  
> +static inline void
> +__remove_wait_queue(wait_queue_head_t *head, wait_queue_t *old)
> +{
> +	__remove_wait_queue_entry(old);
> +}
> +
>  typedef int wait_bit_action_f(struct wait_bit_key *, int mode);
>  void __wake_up(wait_queue_head_t *q, unsigned int mode, int nr, void *key);
>  void __wake_up_locked_key(wait_queue_head_t *q, unsigned int mode, void *key);

Jason Wang May 3, 2016, 2:52 a.m. UTC | #2
On 04/28/2016 02:19 PM, Jason Wang wrote:
> On 04/27/2016 07:28 PM, Michael S. Tsirkin wrote:
>> > On Tue, Apr 26, 2016 at 03:35:53AM -0400, Jason Wang wrote:
>>> >> We don't stop polling the socket during rx processing, which leads to
>>> >> unnecessary wakeups from underlying net devices (e.g.
>>> >> sock_def_readable() from tun), slowing down rx. This patch avoids this
>>> >> by stopping socket polling during rx processing. A small drawback is
>>> >> some overhead in the light load case because of the extra start/stop
>>> >> polling, but a single netperf TCP_RR run does not notice any change.
>>> >> In a super heavy load case, e.g. using pktgen to inject packets into
>>> >> the guest, we get about a 9.5% improvement in pps:
>>> >>
>>> >> before: ~1370000 pkt/s
>>> >> after:  ~1500000 pkt/s
>>> >>
>>> >> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> > Acked-by: Michael S. Tsirkin <mst@redhat.com>
>> >
>> > There is one other possible enhancement: we actually have the wait
>> > queue lock taken in __wake_up, but we give it up only to take it again
>> > in the handler.
>> >
>> > It would be nicer to just remove the entry when we wake the vhost
>> > thread, and re-add it if required. I think that something like the
>> > patch below would give you the necessary API. Please feel free to use
>> > it if you implement a follow-up patch doing this - that's not a reason
>> > to hold back this simple patch, though.
> Thanks, this looks useful, will give it a try.

I want to try it, but it looks like this will result in a strange API:

- the poll entry is removed automatically during wakeup, so the handler
does not need to care about this
- but the handler still needs to re-add the poll explicitly in the code
(see the sketch below)

?
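
To make that asymmetry concrete, a hypothetical handler pair under the
proposed API could look like this (a sketch only; the names and
structure are assumptions, not code from either patch):

static int vhost_poll_wakeup(wait_queue_t *wait, unsigned mode, int sync,
			     void *key)
{
	struct vhost_poll *poll = container_of(wait, struct vhost_poll, wait);

	/* Implicit half: drop our own entry while __wake_up() still
	 * holds the wait queue lock, suppressing further wakeups. */
	__remove_wait_queue_entry(wait);
	vhost_poll_queue(poll);
	return 0;
}

static void handle_rx_work(struct vhost_net *net)
{
	/* ... drain the socket into the guest rx ring ... */

	/* Explicit half: the handler must remember to re-add the
	 * entry itself (e.g. via vhost_poll_start()), or it will
	 * never be woken again. */
}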

Patch

diff --git a/include/linux/wait.h b/include/linux/wait.h
index 27d7a0a..9c6604b 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -191,11 +191,17 @@ __add_wait_queue_tail_exclusive(wait_queue_head_t *q, wait_queue_t *wait)
 }
 
 static inline void
-__remove_wait_queue(wait_queue_head_t *head, wait_queue_t *old)
+__remove_wait_queue_entry(wait_queue_t *old)
 {
 	list_del(&old->task_list);
 }
 
+static inline void
+__remove_wait_queue(wait_queue_head_t *head, wait_queue_t *old)
+{
+	__remove_wait_queue_entry(old);
+}
+
 typedef int wait_bit_action_f(struct wait_bit_key *, int mode);
 void __wake_up(wait_queue_head_t *q, unsigned int mode, int nr, void *key);
 void __wake_up_locked_key(wait_queue_head_t *q, unsigned int mode, void *key);
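
For reference, a sketch of how a wake up handler could use the new
helper (an assumed consumer, not part of the patch). The handler runs
with the wait queue lock already held by __wake_up(), so it may unlink
its own entry without access to the wait_queue_head_t:

#include <linux/wait.h>

/* One-shot style wakeup callback: fires once, then unhooks itself. */
static int oneshot_wakeup(wait_queue_t *wait, unsigned mode, int sync,
			  void *key)
{
	/* Safe here: __wake_up() holds the queue lock around callbacks. */
	__remove_wait_queue_entry(wait);

	/* ... hand the event off to a worker; re-arm later with
	 * add_wait_queue() if further wakeups are wanted ... */
	return 0;
}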