Message ID | 20230308024935.91686-4-xuanzhuo@linux.alibaba.com (mailing list archive) |
---|---|
State | Accepted |
Commit | cd1c604aa1d8c641f5edcb58b76352d4eba06ec1 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | add checking sq is full inside xdp xmit | expand |
On Wed, Mar 8, 2023 at 10:49 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote: > > If the queue of xdp xmit is not an independent queue, then when the xdp > xmit used all the desc, the xmit from the __dev_queue_xmit() may encounter > the following error. > > net ens4: Unexpected TXQ (0) queue failure: -28 > > This patch adds a check whether sq is full in xdp xmit. > > Fixes: 56434a01b12e ("virtio_net: add XDP_TX support") > Reported-by: Yichun Zhang <yichun@openresty.com> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> > Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Thanks > --- > drivers/net/virtio_net.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index 46bbddaadb0d..1a309cfb4976 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -767,6 +767,9 @@ static int virtnet_xdp_xmit(struct net_device *dev, > } > ret = nxmit; > > + if (!is_xdp_raw_buffer_queue(vi, sq - vi->sq)) > + check_sq_full_and_disable(vi, dev, sq); > + > if (flags & XDP_XMIT_FLUSH) { > if (virtqueue_kick_prepare(sq->vq) && virtqueue_notify(sq->vq)) > kicks = 1; > -- > 2.32.0.3.g01195cf9f >
On 2023/3/8 10:49, Xuan Zhuo wrote: > If the queue of xdp xmit is not an independent queue, then when the xdp > xmit used all the desc, the xmit from the __dev_queue_xmit() may encounter > the following error. > > net ens4: Unexpected TXQ (0) queue failure: -28 > > This patch adds a check whether sq is full in xdp xmit. > > Fixes: 56434a01b12e ("virtio_net: add XDP_TX support") > Reported-by: Yichun Zhang <yichun@openresty.com> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> > Acked-by: Michael S. Tsirkin <mst@redhat.com> > --- > drivers/net/virtio_net.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index 46bbddaadb0d..1a309cfb4976 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -767,6 +767,9 @@ static int virtnet_xdp_xmit(struct net_device *dev, > } > ret = nxmit; > > + if (!is_xdp_raw_buffer_queue(vi, sq - vi->sq)) > + check_sq_full_and_disable(vi, dev, sq); > + Sorry if I missed something obvious here. As the comment in start_xmit(), the current skb is added to the sq->vq, so NETDEV_TX_BUSY can not be returned. /* If running out of space, stop queue to avoid getting packets that we * are then unable to transmit. * An alternative would be to force queuing layer to requeue the skb by * returning NETDEV_TX_BUSY. However, NETDEV_TX_BUSY should not be * returned in a normal path of operation: it means that driver is not * maintaining the TX queue stop/start state properly, and causes * the stack to do a non-trivial amount of useless work. * Since most packets only take 1 or 2 ring slots, stopping the queue * early means 16 slots are typically wasted. */ It there any reason not to check the sq->vq->num_free at the begin of start_xmit(), if the space is not enough for the current skb, TX queue is stopped and NETDEV_TX_BUSY is return to the stack to requeue the current skb. It seems it is the pattern that most network driver follow, and it seems we can avoid calling check_sq_full_and_disable() in this patch and not wasting 16 slots as mentioned in the comment above. > if (flags & XDP_XMIT_FLUSH) { > if (virtqueue_kick_prepare(sq->vq) && virtqueue_notify(sq->vq)) > kicks = 1; >
On Wed, 8 Mar 2023 14:59:36 +0800, Yunsheng Lin <linyunsheng@huawei.com> wrote: > On 2023/3/8 10:49, Xuan Zhuo wrote: > > If the queue of xdp xmit is not an independent queue, then when the xdp > > xmit used all the desc, the xmit from the __dev_queue_xmit() may encounter > > the following error. > > > > net ens4: Unexpected TXQ (0) queue failure: -28 > > > > This patch adds a check whether sq is full in xdp xmit. > > > > Fixes: 56434a01b12e ("virtio_net: add XDP_TX support") > > Reported-by: Yichun Zhang <yichun@openresty.com> > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > > Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> > > Acked-by: Michael S. Tsirkin <mst@redhat.com> > > --- > > drivers/net/virtio_net.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > > index 46bbddaadb0d..1a309cfb4976 100644 > > --- a/drivers/net/virtio_net.c > > +++ b/drivers/net/virtio_net.c > > @@ -767,6 +767,9 @@ static int virtnet_xdp_xmit(struct net_device *dev, > > } > > ret = nxmit; > > > > + if (!is_xdp_raw_buffer_queue(vi, sq - vi->sq)) > > + check_sq_full_and_disable(vi, dev, sq); > > + > > Sorry if I missed something obvious here. > > As the comment in start_xmit(), the current skb is added to the sq->vq, so > NETDEV_TX_BUSY can not be returned. > > /* If running out of space, stop queue to avoid getting packets that we > * are then unable to transmit. > * An alternative would be to force queuing layer to requeue the skb by > * returning NETDEV_TX_BUSY. However, NETDEV_TX_BUSY should not be > * returned in a normal path of operation: it means that driver is not > * maintaining the TX queue stop/start state properly, and causes > * the stack to do a non-trivial amount of useless work. > * Since most packets only take 1 or 2 ring slots, stopping the queue > * early means 16 slots are typically wasted. > */ > > It there any reason not to check the sq->vq->num_free at the begin of start_xmit(), > if the space is not enough for the current skb, TX queue is stopped and NETDEV_TX_BUSY > is return to the stack to requeue the current skb. > > It seems it is the pattern that most network driver follow, and it seems we can avoid > calling check_sq_full_and_disable() in this patch and not wasting 16 slots as mentioned > in the comment above. > * netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb, * struct net_device *dev); * Called when a packet needs to be transmitted. * Returns NETDEV_TX_OK. Can return NETDEV_TX_BUSY, but you should stop * the queue before that can happen; it's for obsolete devices and weird * corner cases, but the stack really does a non-trivial amount * of useless work if you return NETDEV_TX_BUSY. * Required; cannot be NULL. It does not affect the XDP TX. It is just that there are some waste on the path of the protocol stack. For example, TCP will do some unnecessary work based on the return value here. Thanks. > > > if (flags & XDP_XMIT_FLUSH) { > > if (virtqueue_kick_prepare(sq->vq) && virtqueue_notify(sq->vq)) > > kicks = 1; > >
On 2023/3/8 15:14, Xuan Zhuo wrote: > On Wed, 8 Mar 2023 14:59:36 +0800, Yunsheng Lin <linyunsheng@huawei.com> wrote: >> On 2023/3/8 10:49, Xuan Zhuo wrote: >>> If the queue of xdp xmit is not an independent queue, then when the xdp >>> xmit used all the desc, the xmit from the __dev_queue_xmit() may encounter >>> the following error. >>> >>> net ens4: Unexpected TXQ (0) queue failure: -28 >>> >>> This patch adds a check whether sq is full in xdp xmit. >>> >>> Fixes: 56434a01b12e ("virtio_net: add XDP_TX support") >>> Reported-by: Yichun Zhang <yichun@openresty.com> >>> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> >>> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> >>> Acked-by: Michael S. Tsirkin <mst@redhat.com> >>> --- >>> drivers/net/virtio_net.c | 3 +++ >>> 1 file changed, 3 insertions(+) >>> >>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c >>> index 46bbddaadb0d..1a309cfb4976 100644 >>> --- a/drivers/net/virtio_net.c >>> +++ b/drivers/net/virtio_net.c >>> @@ -767,6 +767,9 @@ static int virtnet_xdp_xmit(struct net_device *dev, >>> } >>> ret = nxmit; >>> >>> + if (!is_xdp_raw_buffer_queue(vi, sq - vi->sq)) >>> + check_sq_full_and_disable(vi, dev, sq); >>> + >> >> Sorry if I missed something obvious here. >> >> As the comment in start_xmit(), the current skb is added to the sq->vq, so >> NETDEV_TX_BUSY can not be returned. >> >> /* If running out of space, stop queue to avoid getting packets that we >> * are then unable to transmit. >> * An alternative would be to force queuing layer to requeue the skb by >> * returning NETDEV_TX_BUSY. However, NETDEV_TX_BUSY should not be >> * returned in a normal path of operation: it means that driver is not >> * maintaining the TX queue stop/start state properly, and causes >> * the stack to do a non-trivial amount of useless work. >> * Since most packets only take 1 or 2 ring slots, stopping the queue >> * early means 16 slots are typically wasted. >> */ >> >> It there any reason not to check the sq->vq->num_free at the begin of start_xmit(), >> if the space is not enough for the current skb, TX queue is stopped and NETDEV_TX_BUSY >> is return to the stack to requeue the current skb. >> >> It seems it is the pattern that most network driver follow, and it seems we can avoid >> calling check_sq_full_and_disable() in this patch and not wasting 16 slots as mentioned >> in the comment above. >> > > > > * netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb, > * struct net_device *dev); > * Called when a packet needs to be transmitted. > * Returns NETDEV_TX_OK. Can return NETDEV_TX_BUSY, but you should stop > * the queue before that can happen; it's for obsolete devices and weird > * corner cases, but the stack really does a non-trivial amount > * of useless work if you return NETDEV_TX_BUSY. > * Required; cannot be NULL. Thanks for the pointer. It is intersting, it seems most driver is not flollowing the suggestion. I found out why the above comment was added, but I am not sure I understand what does "non-trivial amount of useless work" means yet. https://lists.linuxfoundation.org/pipermail/virtualization/2015-April/029718.html > > It does not affect the XDP TX. It is just that there are some waste on the > path of the protocol stack. > > For example, TCP will do some unnecessary work based on the return value here. I thought it was handled by the sched layer as below, TCP does not need to be be aware of that? Am I missed something. https://elixir.bootlin.com/linux/v6.3-rc1/source/net/sched/sch_generic.c#L362 > > Thanks. > > >> >>> if (flags & XDP_XMIT_FLUSH) { >>> if (virtqueue_kick_prepare(sq->vq) && virtqueue_notify(sq->vq)) >>> kicks = 1; >>> > > . >
On Wed, Mar 08, 2023 at 04:13:12PM +0800, Yunsheng Lin wrote: > On 2023/3/8 15:14, Xuan Zhuo wrote: > > On Wed, 8 Mar 2023 14:59:36 +0800, Yunsheng Lin <linyunsheng@huawei.com> wrote: > >> On 2023/3/8 10:49, Xuan Zhuo wrote: > >>> If the queue of xdp xmit is not an independent queue, then when the xdp > >>> xmit used all the desc, the xmit from the __dev_queue_xmit() may encounter > >>> the following error. > >>> > >>> net ens4: Unexpected TXQ (0) queue failure: -28 > >>> > >>> This patch adds a check whether sq is full in xdp xmit. > >>> > >>> Fixes: 56434a01b12e ("virtio_net: add XDP_TX support") > >>> Reported-by: Yichun Zhang <yichun@openresty.com> > >>> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > >>> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> > >>> Acked-by: Michael S. Tsirkin <mst@redhat.com> > >>> --- > >>> drivers/net/virtio_net.c | 3 +++ > >>> 1 file changed, 3 insertions(+) > >>> > >>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > >>> index 46bbddaadb0d..1a309cfb4976 100644 > >>> --- a/drivers/net/virtio_net.c > >>> +++ b/drivers/net/virtio_net.c > >>> @@ -767,6 +767,9 @@ static int virtnet_xdp_xmit(struct net_device *dev, > >>> } > >>> ret = nxmit; > >>> > >>> + if (!is_xdp_raw_buffer_queue(vi, sq - vi->sq)) > >>> + check_sq_full_and_disable(vi, dev, sq); > >>> + > >> > >> Sorry if I missed something obvious here. > >> > >> As the comment in start_xmit(), the current skb is added to the sq->vq, so > >> NETDEV_TX_BUSY can not be returned. > >> > >> /* If running out of space, stop queue to avoid getting packets that we > >> * are then unable to transmit. > >> * An alternative would be to force queuing layer to requeue the skb by > >> * returning NETDEV_TX_BUSY. However, NETDEV_TX_BUSY should not be > >> * returned in a normal path of operation: it means that driver is not > >> * maintaining the TX queue stop/start state properly, and causes > >> * the stack to do a non-trivial amount of useless work. > >> * Since most packets only take 1 or 2 ring slots, stopping the queue > >> * early means 16 slots are typically wasted. > >> */ > >> > >> It there any reason not to check the sq->vq->num_free at the begin of start_xmit(), > >> if the space is not enough for the current skb, TX queue is stopped and NETDEV_TX_BUSY > >> is return to the stack to requeue the current skb. > >> > >> It seems it is the pattern that most network driver follow, and it seems we can avoid > >> calling check_sq_full_and_disable() in this patch and not wasting 16 slots as mentioned > >> in the comment above. > >> > > > > > > > > * netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb, > > * struct net_device *dev); > > * Called when a packet needs to be transmitted. > > * Returns NETDEV_TX_OK. Can return NETDEV_TX_BUSY, but you should stop > > * the queue before that can happen; it's for obsolete devices and weird > > * corner cases, but the stack really does a non-trivial amount > > * of useless work if you return NETDEV_TX_BUSY. > > * Required; cannot be NULL. > > Thanks for the pointer. It is intersting, it seems most driver is not flollowing > the suggestion. Yes - I don't know why. > I found out why the above comment was added, but I am not sure I understand > what does "non-trivial amount of useless work" means yet. > https://lists.linuxfoundation.org/pipermail/virtualization/2015-April/029718.html dev_requeue_skb
On 2023/3/8 20:21, Michael S. Tsirkin wrote: > On Wed, Mar 08, 2023 at 04:13:12PM +0800, Yunsheng Lin wrote: >> On 2023/3/8 15:14, Xuan Zhuo wrote: >>> On Wed, 8 Mar 2023 14:59:36 +0800, Yunsheng Lin <linyunsheng@huawei.com> wrote: >>>> On 2023/3/8 10:49, Xuan Zhuo wrote: >>>>> If the queue of xdp xmit is not an independent queue, then when the xdp >>>>> xmit used all the desc, the xmit from the __dev_queue_xmit() may encounter >>>>> the following error. >>>>> >>>>> net ens4: Unexpected TXQ (0) queue failure: -28 >>>>> >>>>> This patch adds a check whether sq is full in xdp xmit. >>>>> >>>>> Fixes: 56434a01b12e ("virtio_net: add XDP_TX support") >>>>> Reported-by: Yichun Zhang <yichun@openresty.com> >>>>> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> >>>>> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> >>>>> Acked-by: Michael S. Tsirkin <mst@redhat.com> >>>>> --- >>>>> drivers/net/virtio_net.c | 3 +++ >>>>> 1 file changed, 3 insertions(+) >>>>> >>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c >>>>> index 46bbddaadb0d..1a309cfb4976 100644 >>>>> --- a/drivers/net/virtio_net.c >>>>> +++ b/drivers/net/virtio_net.c >>>>> @@ -767,6 +767,9 @@ static int virtnet_xdp_xmit(struct net_device *dev, >>>>> } >>>>> ret = nxmit; >>>>> >>>>> + if (!is_xdp_raw_buffer_queue(vi, sq - vi->sq)) >>>>> + check_sq_full_and_disable(vi, dev, sq); >>>>> + >>>> >>>> Sorry if I missed something obvious here. >>>> >>>> As the comment in start_xmit(), the current skb is added to the sq->vq, so >>>> NETDEV_TX_BUSY can not be returned. >>>> >>>> /* If running out of space, stop queue to avoid getting packets that we >>>> * are then unable to transmit. >>>> * An alternative would be to force queuing layer to requeue the skb by >>>> * returning NETDEV_TX_BUSY. However, NETDEV_TX_BUSY should not be >>>> * returned in a normal path of operation: it means that driver is not >>>> * maintaining the TX queue stop/start state properly, and causes >>>> * the stack to do a non-trivial amount of useless work. >>>> * Since most packets only take 1 or 2 ring slots, stopping the queue >>>> * early means 16 slots are typically wasted. >>>> */ >>>> >>>> It there any reason not to check the sq->vq->num_free at the begin of start_xmit(), >>>> if the space is not enough for the current skb, TX queue is stopped and NETDEV_TX_BUSY >>>> is return to the stack to requeue the current skb. >>>> >>>> It seems it is the pattern that most network driver follow, and it seems we can avoid >>>> calling check_sq_full_and_disable() in this patch and not wasting 16 slots as mentioned >>>> in the comment above. >>>> >>> >>> >>> >>> * netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb, >>> * struct net_device *dev); >>> * Called when a packet needs to be transmitted. >>> * Returns NETDEV_TX_OK. Can return NETDEV_TX_BUSY, but you should stop >>> * the queue before that can happen; it's for obsolete devices and weird >>> * corner cases, but the stack really does a non-trivial amount >>> * of useless work if you return NETDEV_TX_BUSY. >>> * Required; cannot be NULL. >> >> Thanks for the pointer. It is intersting, it seems most driver is not flollowing >> the suggestion. > > Yes - I don't know why. I digged a little deeper. And there was a long discussion about this. https://patchwork.ozlabs.org/project/netdev/patch/200906031247.05591.rusty@rustcorp.com.au/#70283 > >> I found out why the above comment was added, but I am not sure I understand >> what does "non-trivial amount of useless work" means yet. >> https://lists.linuxfoundation.org/pipermail/virtualization/2015-April/029718.html > > dev_requeue_skb > That's the part I did not understand, the implemention of dev_requeue_skb() does not seem "non-trivial" to me, maybe the implemention before dev_requeue_skb() was non-trivial. But now "non-trivial amount of useless work" just seems to add confusion here. Anyway, according to the discussion between Herbert and Rusty, the problems seems to be: 1. tcpdump may see the packet twice due to requeuing. 2. higher priority skb is not able to preempt the requeued low priority skb. For 1, it seems fixable. For 2, I am not a QoS expert, but does the big TCP support added recently and the batching support in dev_hard_start_xmit() adds more noise to this qos problem?
On Wed, 8 Mar 2023 07:21:07 -0500 Michael S. Tsirkin wrote: > > > * netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb, > > > * struct net_device *dev); > > > * Called when a packet needs to be transmitted. > > > * Returns NETDEV_TX_OK. Can return NETDEV_TX_BUSY, but you should stop > > > * the queue before that can happen; it's for obsolete devices and weird > > > * corner cases, but the stack really does a non-trivial amount > > > * of useless work if you return NETDEV_TX_BUSY. > > > * Required; cannot be NULL. > > > > Thanks for the pointer. It is intersting, it seems most driver is not flollowing > > the suggestion. > > Yes - I don't know why. Most modern drivers don't stop the queue upfront? We try to catch it in review, so anything remotely recent should. But a lot of people DTRT *and* check at the start of .xmit if the skb fits (return BUSY if it doesn't) - defensive programming. But the BUSY path should never hit.
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 46bbddaadb0d..1a309cfb4976 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -767,6 +767,9 @@ static int virtnet_xdp_xmit(struct net_device *dev, } ret = nxmit; + if (!is_xdp_raw_buffer_queue(vi, sq - vi->sq)) + check_sq_full_and_disable(vi, dev, sq); + if (flags & XDP_XMIT_FLUSH) { if (virtqueue_kick_prepare(sq->vq) && virtqueue_notify(sq->vq)) kicks = 1;