diff mbox series

[net,v4,2/2] vhost_net: fix tx queue stuck when sendmsg fails

Message ID 1608776746-4040-1-git-send-email-wangyunjian@huawei.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series fixes for vhost_net | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net
netdev/subject_prefix success Link
netdev/cc_maintainers warning 5 maintainers not CCed: kvm@vger.kernel.org davem@davemloft.net rusty@rustcorp.com.au arnd@arndb.de paulmck@kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 40 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

wangyunjian Dec. 24, 2020, 2:25 a.m. UTC
From: Yunjian Wang <wangyunjian@huawei.com>

Currently the driver doesn't drop a packet which can't be sent by tun
(e.g bad packet). In this case, the driver will always process the
same packet lead to the tx queue stuck.

To fix this issue:
1. in the case of persistent failure (e.g bad packet), the driver
can skip this descriptor by ignoring the error.
2. in the case of transient failure (e.g -EAGAIN and -ENOMEM), the
driver schedules the worker to try again.

Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
---
 drivers/vhost/net.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

Comments

Willem de Bruijn Dec. 24, 2020, 2:39 a.m. UTC | #1
On Wed, Dec 23, 2020 at 9:25 PM wangyunjian <wangyunjian@huawei.com> wrote:
>
> From: Yunjian Wang <wangyunjian@huawei.com>
>
> Currently the driver doesn't drop a packet which can't be sent by tun
> (e.g bad packet). In this case, the driver will always process the
> same packet lead to the tx queue stuck.
>
> To fix this issue:
> 1. in the case of persistent failure (e.g bad packet), the driver
> can skip this descriptor by ignoring the error.
> 2. in the case of transient failure (e.g -EAGAIN and -ENOMEM), the
> driver schedules the worker to try again.
>
> Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server")
> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>

Acked-by: Willem de Bruijn <willemb@google.com>

Thanks.
Jason Wang Dec. 24, 2020, 3:10 a.m. UTC | #2
On 2020/12/24 上午10:25, wangyunjian wrote:
> From: Yunjian Wang <wangyunjian@huawei.com>
>
> Currently the driver doesn't drop a packet which can't be sent by tun
> (e.g bad packet). In this case, the driver will always process the
> same packet lead to the tx queue stuck.
>
> To fix this issue:
> 1. in the case of persistent failure (e.g bad packet), the driver
> can skip this descriptor by ignoring the error.
> 2. in the case of transient failure (e.g -EAGAIN and -ENOMEM), the
> driver schedules the worker to try again.


I might be wrong but looking at alloc_skb_with_frags(), it returns 
-ENOBUFS actually:

     *errcode = -ENOBUFS;
     skb = alloc_skb(header_len, gfp_mask);
     if (!skb)
         return NULL;

Thanks


>
> Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server")
> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> ---
>   drivers/vhost/net.c | 16 ++++++++--------
>   1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index c8784dfafdd7..e76245daa7f6 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -827,14 +827,13 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
>   				msg.msg_flags &= ~MSG_MORE;
>   		}
>   
> -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
>   		err = sock->ops->sendmsg(sock, &msg, len);
> -		if (unlikely(err < 0)) {
> +		if (unlikely(err == -EAGAIN || err == -ENOMEM)) {
>   			vhost_discard_vq_desc(vq, 1);
>   			vhost_net_enable_vq(net, vq);
>   			break;
>   		}
> -		if (err != len)
> +		if (err >= 0 && err != len)
>   			pr_debug("Truncated TX packet: len %d != %zd\n",
>   				 err, len);
>   done:
> @@ -922,7 +921,6 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>   			msg.msg_flags &= ~MSG_MORE;
>   		}
>   
> -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
>   		err = sock->ops->sendmsg(sock, &msg, len);
>   		if (unlikely(err < 0)) {
>   			if (zcopy_used) {
> @@ -931,11 +929,13 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>   				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
>   					% UIO_MAXIOV;
>   			}
> -			vhost_discard_vq_desc(vq, 1);
> -			vhost_net_enable_vq(net, vq);
> -			break;
> +			if (err == -EAGAIN || err == -ENOMEM) {
> +				vhost_discard_vq_desc(vq, 1);
> +				vhost_net_enable_vq(net, vq);
> +				break;
> +			}
>   		}
> -		if (err != len)
> +		if (err >= 0 && err != len)
>   			pr_debug("Truncated TX packet: "
>   				 " len %d != %zd\n", err, len);
>   		if (!zcopy_used)
wangyunjian Dec. 24, 2020, 4:37 a.m. UTC | #3
> -----Original Message-----
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Thursday, December 24, 2020 11:10 AM
> To: wangyunjian <wangyunjian@huawei.com>; netdev@vger.kernel.org;
> mst@redhat.com; willemdebruijn.kernel@gmail.com
> Cc: virtualization@lists.linux-foundation.org; Lilijun (Jerry)
> <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>;
> xudingke <xudingke@huawei.com>; huangbin (J)
> <brian.huangbin@huawei.com>
> Subject: Re: [PATCH net v4 2/2] vhost_net: fix tx queue stuck when sendmsg
> fails
> 
> 
> On 2020/12/24 上午10:25, wangyunjian wrote:
> > From: Yunjian Wang <wangyunjian@huawei.com>
> >
> > Currently the driver doesn't drop a packet which can't be sent by tun
> > (e.g bad packet). In this case, the driver will always process the
> > same packet lead to the tx queue stuck.
> >
> > To fix this issue:
> > 1. in the case of persistent failure (e.g bad packet), the driver can
> > skip this descriptor by ignoring the error.
> > 2. in the case of transient failure (e.g -EAGAIN and -ENOMEM), the
> > driver schedules the worker to try again.
> 
> 
> I might be wrong but looking at alloc_skb_with_frags(), it returns -ENOBUFS
> actually:
> 
>      *errcode = -ENOBUFS;
>      skb = alloc_skb(header_len, gfp_mask);
>      if (!skb)
>          return NULL;

Yes, but the sock_alloc_send_pskb() returns - EAGAIN when no sndbuf space.
So there is need to check return value which is -EAGAIN or -ENOMEM or - EAGAIN?

struct sk_buff *sock_alloc_send_pskb()
{
...
	for (;;) {
...
		sk_set_bit(SOCKWQ_ASYNC_NOSPACE, sk);
		set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
		err = -EAGAIN;
		if (!timeo)
			goto failure;
...
	}
	skb = alloc_skb_with_frags(header_len, data_len, max_page_order,
				   errcode, sk->sk_allocation);
	if (skb)
		skb_set_owner_w(skb, sk);
	return skb;
...
	*errcode = err;
	return NULL;
}
> 
> Thanks
> 
> 
> >
> > Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server")
> > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > ---
> >   drivers/vhost/net.c | 16 ++++++++--------
> >   1 file changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index
> > c8784dfafdd7..e76245daa7f6 100644
> > --- a/drivers/vhost/net.c
> > +++ b/drivers/vhost/net.c
> > @@ -827,14 +827,13 @@ static void handle_tx_copy(struct vhost_net *net,
> struct socket *sock)
> >   				msg.msg_flags &= ~MSG_MORE;
> >   		}
> >
> > -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
> >   		err = sock->ops->sendmsg(sock, &msg, len);
> > -		if (unlikely(err < 0)) {
> > +		if (unlikely(err == -EAGAIN || err == -ENOMEM)) {
> >   			vhost_discard_vq_desc(vq, 1);
> >   			vhost_net_enable_vq(net, vq);
> >   			break;
> >   		}
> > -		if (err != len)
> > +		if (err >= 0 && err != len)
> >   			pr_debug("Truncated TX packet: len %d != %zd\n",
> >   				 err, len);
> >   done:
> > @@ -922,7 +921,6 @@ static void handle_tx_zerocopy(struct vhost_net
> *net, struct socket *sock)
> >   			msg.msg_flags &= ~MSG_MORE;
> >   		}
> >
> > -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
> >   		err = sock->ops->sendmsg(sock, &msg, len);
> >   		if (unlikely(err < 0)) {
> >   			if (zcopy_used) {
> > @@ -931,11 +929,13 @@ static void handle_tx_zerocopy(struct vhost_net
> *net, struct socket *sock)
> >   				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
> >   					% UIO_MAXIOV;
> >   			}
> > -			vhost_discard_vq_desc(vq, 1);
> > -			vhost_net_enable_vq(net, vq);
> > -			break;
> > +			if (err == -EAGAIN || err == -ENOMEM) {
> > +				vhost_discard_vq_desc(vq, 1);
> > +				vhost_net_enable_vq(net, vq);
> > +				break;
> > +			}
> >   		}
> > -		if (err != len)
> > +		if (err >= 0 && err != len)
> >   			pr_debug("Truncated TX packet: "
> >   				 " len %d != %zd\n", err, len);
> >   		if (!zcopy_used)
Jason Wang Dec. 24, 2020, 5:56 a.m. UTC | #4
On 2020/12/24 下午12:37, wangyunjian wrote:
>> -----Original Message-----
>> From: Jason Wang [mailto:jasowang@redhat.com]
>> Sent: Thursday, December 24, 2020 11:10 AM
>> To: wangyunjian <wangyunjian@huawei.com>; netdev@vger.kernel.org;
>> mst@redhat.com; willemdebruijn.kernel@gmail.com
>> Cc: virtualization@lists.linux-foundation.org; Lilijun (Jerry)
>> <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>;
>> xudingke <xudingke@huawei.com>; huangbin (J)
>> <brian.huangbin@huawei.com>
>> Subject: Re: [PATCH net v4 2/2] vhost_net: fix tx queue stuck when sendmsg
>> fails
>>
>>
>> On 2020/12/24 上午10:25, wangyunjian wrote:
>>> From: Yunjian Wang <wangyunjian@huawei.com>
>>>
>>> Currently the driver doesn't drop a packet which can't be sent by tun
>>> (e.g bad packet). In this case, the driver will always process the
>>> same packet lead to the tx queue stuck.
>>>
>>> To fix this issue:
>>> 1. in the case of persistent failure (e.g bad packet), the driver can
>>> skip this descriptor by ignoring the error.
>>> 2. in the case of transient failure (e.g -EAGAIN and -ENOMEM), the
>>> driver schedules the worker to try again.
>>
>> I might be wrong but looking at alloc_skb_with_frags(), it returns -ENOBUFS
>> actually:
>>
>>       *errcode = -ENOBUFS;
>>       skb = alloc_skb(header_len, gfp_mask);
>>       if (!skb)
>>           return NULL;
> Yes, but the sock_alloc_send_pskb() returns - EAGAIN when no sndbuf space.
> So there is need to check return value which is -EAGAIN or -ENOMEM or - EAGAIN?
>
> struct sk_buff *sock_alloc_send_pskb()
> {
> ...
> 	for (;;) {
> ...
> 		sk_set_bit(SOCKWQ_ASYNC_NOSPACE, sk);
> 		set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> 		err = -EAGAIN;
> 		if (!timeo)
> 			goto failure;
> ...
> 	}
> 	skb = alloc_skb_with_frags(header_len, data_len, max_page_order,
> 				   errcode, sk->sk_allocation);
> 	if (skb)
> 		skb_set_owner_w(skb, sk);
> 	return skb;
> ...
> 	*errcode = err;
> 	return NULL;
> }


-EAGAIN and -ENOBFS are fine. But I don't see how -ENOMEM can be returned.

Thanks


>> Thanks
>>
>>
>>> Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server")
>>> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
>>> ---
>>>    drivers/vhost/net.c | 16 ++++++++--------
>>>    1 file changed, 8 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index
>>> c8784dfafdd7..e76245daa7f6 100644
>>> --- a/drivers/vhost/net.c
>>> +++ b/drivers/vhost/net.c
>>> @@ -827,14 +827,13 @@ static void handle_tx_copy(struct vhost_net *net,
>> struct socket *sock)
>>>    				msg.msg_flags &= ~MSG_MORE;
>>>    		}
>>>
>>> -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
>>>    		err = sock->ops->sendmsg(sock, &msg, len);
>>> -		if (unlikely(err < 0)) {
>>> +		if (unlikely(err == -EAGAIN || err == -ENOMEM)) {
>>>    			vhost_discard_vq_desc(vq, 1);
>>>    			vhost_net_enable_vq(net, vq);
>>>    			break;
>>>    		}
>>> -		if (err != len)
>>> +		if (err >= 0 && err != len)
>>>    			pr_debug("Truncated TX packet: len %d != %zd\n",
>>>    				 err, len);
>>>    done:
>>> @@ -922,7 +921,6 @@ static void handle_tx_zerocopy(struct vhost_net
>> *net, struct socket *sock)
>>>    			msg.msg_flags &= ~MSG_MORE;
>>>    		}
>>>
>>> -		/* TODO: Check specific error and bomb out unless ENOBUFS? */
>>>    		err = sock->ops->sendmsg(sock, &msg, len);
>>>    		if (unlikely(err < 0)) {
>>>    			if (zcopy_used) {
>>> @@ -931,11 +929,13 @@ static void handle_tx_zerocopy(struct vhost_net
>> *net, struct socket *sock)
>>>    				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
>>>    					% UIO_MAXIOV;
>>>    			}
>>> -			vhost_discard_vq_desc(vq, 1);
>>> -			vhost_net_enable_vq(net, vq);
>>> -			break;
>>> +			if (err == -EAGAIN || err == -ENOMEM) {
>>> +				vhost_discard_vq_desc(vq, 1);
>>> +				vhost_net_enable_vq(net, vq);
>>> +				break;
>>> +			}
>>>    		}
>>> -		if (err != len)
>>> +		if (err >= 0 && err != len)
>>>    			pr_debug("Truncated TX packet: "
>>>    				 " len %d != %zd\n", err, len);
>>>    		if (!zcopy_used)
wangyunjian Dec. 24, 2020, 9:09 a.m. UTC | #5
> -----Original Message-----
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Thursday, December 24, 2020 1:56 PM
> To: wangyunjian <wangyunjian@huawei.com>
> Cc: netdev@vger.kernel.org; mst@redhat.com;
> willemdebruijn.kernel@gmail.com; virtualization@lists.linux-foundation.org;
> Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu
> <chenchanghu@huawei.com>; xudingke <xudingke@huawei.com>; huangbin (J)
> <brian.huangbin@huawei.com>
> Subject: Re: [PATCH net v4 2/2] vhost_net: fix tx queue stuck when sendmsg
> fails
> 
> 
> On 2020/12/24 下午12:37, wangyunjian wrote:
> >> -----Original Message-----
> >> From: Jason Wang [mailto:jasowang@redhat.com]
> >> Sent: Thursday, December 24, 2020 11:10 AM
> >> To: wangyunjian <wangyunjian@huawei.com>; netdev@vger.kernel.org;
> >> mst@redhat.com; willemdebruijn.kernel@gmail.com
> >> Cc: virtualization@lists.linux-foundation.org; Lilijun (Jerry)
> >> <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>;
> >> xudingke <xudingke@huawei.com>; huangbin (J)
> >> <brian.huangbin@huawei.com>
> >> Subject: Re: [PATCH net v4 2/2] vhost_net: fix tx queue stuck when
> >> sendmsg fails
> >>
> >>
> >> On 2020/12/24 上午10:25, wangyunjian wrote:
> >>> From: Yunjian Wang <wangyunjian@huawei.com>
> >>>
> >>> Currently the driver doesn't drop a packet which can't be sent by
> >>> tun (e.g bad packet). In this case, the driver will always process
> >>> the same packet lead to the tx queue stuck.
> >>>
> >>> To fix this issue:
> >>> 1. in the case of persistent failure (e.g bad packet), the driver
> >>> can skip this descriptor by ignoring the error.
> >>> 2. in the case of transient failure (e.g -EAGAIN and -ENOMEM), the
> >>> driver schedules the worker to try again.
> >>
> >> I might be wrong but looking at alloc_skb_with_frags(), it returns
> >> -ENOBUFS
> >> actually:
> >>
> >>       *errcode = -ENOBUFS;
> >>       skb = alloc_skb(header_len, gfp_mask);
> >>       if (!skb)
> >>           return NULL;
> > Yes, but the sock_alloc_send_pskb() returns - EAGAIN when no sndbuf space.
> > So there is need to check return value which is -EAGAIN or -ENOMEM or -
> EAGAIN?
> >
> > struct sk_buff *sock_alloc_send_pskb() { ...
> > 	for (;;) {
> > ...
> > 		sk_set_bit(SOCKWQ_ASYNC_NOSPACE, sk);
> > 		set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> > 		err = -EAGAIN;
> > 		if (!timeo)
> > 			goto failure;
> > ...
> > 	}
> > 	skb = alloc_skb_with_frags(header_len, data_len, max_page_order,
> > 				   errcode, sk->sk_allocation);
> > 	if (skb)
> > 		skb_set_owner_w(skb, sk);
> > 	return skb;
> > ...
> > 	*errcode = err;
> > 	return NULL;
> > }
> 
> 
> -EAGAIN and -ENOBFS are fine. But I don't see how -ENOMEM can be returned.

The tun_build_skb() and tun_napi_alloc_frags() return -ENOMEM when memory
allocation fails.

Thanks

> 
> Thanks
> 
> 
> >> Thanks
> >>
> >>
> >>> Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server")
> >>> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> >>> ---
> >>>    drivers/vhost/net.c | 16 ++++++++--------
> >>>    1 file changed, 8 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index
> >>> c8784dfafdd7..e76245daa7f6 100644
> >>> --- a/drivers/vhost/net.c
> >>> +++ b/drivers/vhost/net.c
> >>> @@ -827,14 +827,13 @@ static void handle_tx_copy(struct vhost_net
> >>> *net,
> >> struct socket *sock)
> >>>    				msg.msg_flags &= ~MSG_MORE;
> >>>    		}
> >>>
> >>> -		/* TODO: Check specific error and bomb out unless ENOBUFS?
> */
> >>>    		err = sock->ops->sendmsg(sock, &msg, len);
> >>> -		if (unlikely(err < 0)) {
> >>> +		if (unlikely(err == -EAGAIN || err == -ENOMEM)) {
> >>>    			vhost_discard_vq_desc(vq, 1);
> >>>    			vhost_net_enable_vq(net, vq);
> >>>    			break;
> >>>    		}
> >>> -		if (err != len)
> >>> +		if (err >= 0 && err != len)
> >>>    			pr_debug("Truncated TX packet: len %d != %zd\n",
> >>>    				 err, len);
> >>>    done:
> >>> @@ -922,7 +921,6 @@ static void handle_tx_zerocopy(struct vhost_net
> >> *net, struct socket *sock)
> >>>    			msg.msg_flags &= ~MSG_MORE;
> >>>    		}
> >>>
> >>> -		/* TODO: Check specific error and bomb out unless ENOBUFS?
> */
> >>>    		err = sock->ops->sendmsg(sock, &msg, len);
> >>>    		if (unlikely(err < 0)) {
> >>>    			if (zcopy_used) {
> >>> @@ -931,11 +929,13 @@ static void handle_tx_zerocopy(struct
> >>> vhost_net
> >> *net, struct socket *sock)
> >>>    				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
> >>>    					% UIO_MAXIOV;
> >>>    			}
> >>> -			vhost_discard_vq_desc(vq, 1);
> >>> -			vhost_net_enable_vq(net, vq);
> >>> -			break;
> >>> +			if (err == -EAGAIN || err == -ENOMEM) {
> >>> +				vhost_discard_vq_desc(vq, 1);
> >>> +				vhost_net_enable_vq(net, vq);
> >>> +				break;
> >>> +			}
> >>>    		}
> >>> -		if (err != len)
> >>> +		if (err >= 0 && err != len)
> >>>    			pr_debug("Truncated TX packet: "
> >>>    				 " len %d != %zd\n", err, len);
> >>>    		if (!zcopy_used)
Jason Wang Dec. 25, 2020, 6:15 a.m. UTC | #6
On 2020/12/24 下午5:09, wangyunjian wrote:
>> -EAGAIN and -ENOBFS are fine. But I don't see how -ENOMEM can be returned.
> The tun_build_skb() and tun_napi_alloc_frags() return -ENOMEM when memory
> allocation fails.
>
> Thanks
>

You're right. So I think we need check them all.

In the future, we need think of a better way. I feel such check is kind 
of fragile since people may modify core sock codes without having a look 
at what vhost does.

Thanks
diff mbox series

Patch

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index c8784dfafdd7..e76245daa7f6 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -827,14 +827,13 @@  static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
 				msg.msg_flags &= ~MSG_MORE;
 		}
 
-		/* TODO: Check specific error and bomb out unless ENOBUFS? */
 		err = sock->ops->sendmsg(sock, &msg, len);
-		if (unlikely(err < 0)) {
+		if (unlikely(err == -EAGAIN || err == -ENOMEM)) {
 			vhost_discard_vq_desc(vq, 1);
 			vhost_net_enable_vq(net, vq);
 			break;
 		}
-		if (err != len)
+		if (err >= 0 && err != len)
 			pr_debug("Truncated TX packet: len %d != %zd\n",
 				 err, len);
 done:
@@ -922,7 +921,6 @@  static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 			msg.msg_flags &= ~MSG_MORE;
 		}
 
-		/* TODO: Check specific error and bomb out unless ENOBUFS? */
 		err = sock->ops->sendmsg(sock, &msg, len);
 		if (unlikely(err < 0)) {
 			if (zcopy_used) {
@@ -931,11 +929,13 @@  static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
 					% UIO_MAXIOV;
 			}
-			vhost_discard_vq_desc(vq, 1);
-			vhost_net_enable_vq(net, vq);
-			break;
+			if (err == -EAGAIN || err == -ENOMEM) {
+				vhost_discard_vq_desc(vq, 1);
+				vhost_net_enable_vq(net, vq);
+				break;
+			}
 		}
-		if (err != len)
+		if (err >= 0 && err != len)
 			pr_debug("Truncated TX packet: "
 				 " len %d != %zd\n", err, len);
 		if (!zcopy_used)