Message ID | 20240712115325.54175-1-leitao@debian.org (mailing list archive) |
---|---|
State | Accepted |
Commit | f8321fa75102246d7415a6af441872f6637c93ab |
Delegated to: | Netdev Maintainers |
Series | [net-next] virtio_net: Fix napi_skb_cache_put warning |
On Fri, 12 Jul 2024 04:53:25 -0700 Breno Leitao wrote:
> Subject: [PATCH net-next] virtio_net: Fix napi_skb_cache_put warning
[PATCH net] for fixes so that the bot knows what to test against :)
No need to repost (this time).
Hello Jakub,

On Fri, Jul 12, 2024 at 07:54:32AM -0700, Jakub Kicinski wrote:
> On Fri, 12 Jul 2024 04:53:25 -0700 Breno Leitao wrote:
> > Subject: [PATCH net-next] virtio_net: Fix napi_skb_cache_put warning
>
> [PATCH net] for fixes so that the bot knows what to test against :)
> No need to repost (this time).

I didn't send to `net` since this WARNING is only "showing" in net-next,
due to commit bdacf3e34945 ("net: Use nested-BH locking for
napi_alloc_cache.") being only in net-next.

But you have a good point, this is a fix and it should go through `net`.
Sorry about it.

--breno
On Fri, 12 Jul 2024 04:53:25 -0700 Breno Leitao wrote:
> After the commit bdacf3e34945 ("net: Use nested-BH locking for
> napi_alloc_cache.") was merged, the following warning began to appear:
>
>  WARNING: CPU: 5 PID: 1 at net/core/skbuff.c:1451 napi_skb_cache_put+0x82/0x4b0
>
>    __warn+0x12f/0x340
>    napi_skb_cache_put+0x82/0x4b0
>    napi_skb_cache_put+0x82/0x4b0
>    report_bug+0x165/0x370
>    handle_bug+0x3d/0x80
>    exc_invalid_op+0x1a/0x50
>    asm_exc_invalid_op+0x1a/0x20
>    __free_old_xmit+0x1c8/0x510
>    napi_skb_cache_put+0x82/0x4b0
>    __free_old_xmit+0x1c8/0x510
>    __free_old_xmit+0x1c8/0x510
>    __pfx___free_old_xmit+0x10/0x10
>
> The issue arises because virtio is assuming it's running in NAPI context
> even when it's not, such as in the netpoll case.
>
> To resolve this, modify virtnet_poll_tx() to only set NAPI when budget
> is available. Same for virtnet_poll_cleantx(), which always assumed that
> it was in a NAPI context.
>
> Fixes: df133f3f9625 ("virtio_net: bulk free tx skbs")
> Suggested-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Breno Leitao <leitao@debian.org>

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
On Fri, 12 Jul 2024 07:58:49 -0700 Breno Leitao wrote:
> I didn't send to `net` since this WARNING is only "showing" in net-next,
> due to commit bdacf3e34945 ("net: Use nested-BH locking for
> napi_alloc_cache.") being only in net-next.
>
> But you have a good point, this is a fix and it should go through `net`.
> sorry about it.

Hah, but it doesn't seem to apply to net.

Let's wait and see if Linus cuts final on Sunday. If he does I'll apply
to net-next and you'll have to send the net version for stable to Greg.
Less merge conflicts for me that way ;) If there's -rc8 please rebase.
On Fri, Jul 12, 2024 at 04:53:25AM -0700, Breno Leitao wrote:
> After the commit bdacf3e34945 ("net: Use nested-BH locking for
> napi_alloc_cache.") was merged, the following warning began to appear:
>
>  WARNING: CPU: 5 PID: 1 at net/core/skbuff.c:1451 napi_skb_cache_put+0x82/0x4b0
>
>    __warn+0x12f/0x340
>    napi_skb_cache_put+0x82/0x4b0
>    napi_skb_cache_put+0x82/0x4b0
>    report_bug+0x165/0x370
>    handle_bug+0x3d/0x80
>    exc_invalid_op+0x1a/0x50
>    asm_exc_invalid_op+0x1a/0x20
>    __free_old_xmit+0x1c8/0x510
>    napi_skb_cache_put+0x82/0x4b0
>    __free_old_xmit+0x1c8/0x510
>    __free_old_xmit+0x1c8/0x510
>    __pfx___free_old_xmit+0x10/0x10
>
> The issue arises because virtio is assuming it's running in NAPI context
> even when it's not, such as in the netpoll case.
>
> To resolve this, modify virtnet_poll_tx() to only set NAPI when budget
> is available. Same for virtnet_poll_cleantx(), which always assumed that
> it was in a NAPI context.
>
> Fixes: df133f3f9625 ("virtio_net: bulk free tx skbs")
> Suggested-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Breno Leitao <leitao@debian.org>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

though I'm not sure I understand the connection with bdacf3e34945.

> ---
>  drivers/net/virtio_net.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 0b4747e81464..fb1331827308 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -2341,7 +2341,7 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
>  	return packets;
>  }
>
> -static void virtnet_poll_cleantx(struct receive_queue *rq)
> +static void virtnet_poll_cleantx(struct receive_queue *rq, int budget)
>  {
>  	struct virtnet_info *vi = rq->vq->vdev->priv;
>  	unsigned int index = vq2rxq(rq->vq);
> @@ -2359,7 +2359,7 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
>
>  		do {
>  			virtqueue_disable_cb(sq->vq);
> -			free_old_xmit(sq, txq, true);
> +			free_old_xmit(sq, txq, !!budget);
>  		} while (unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
>
>  		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> @@ -2404,7 +2404,7 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
>  	unsigned int xdp_xmit = 0;
>  	bool napi_complete;
>
> -	virtnet_poll_cleantx(rq);
> +	virtnet_poll_cleantx(rq, budget);
>
>  	received = virtnet_receive(rq, budget, &xdp_xmit);
>  	rq->packets_in_napi += received;
> @@ -2526,7 +2526,7 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget)
>  	txq = netdev_get_tx_queue(vi->dev, index);
>  	__netif_tx_lock(txq, raw_smp_processor_id());
>  	virtqueue_disable_cb(sq->vq);
> -	free_old_xmit(sq, txq, true);
> +	free_old_xmit(sq, txq, !!budget);
>
>  	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
>  		if (netif_tx_queue_stopped(txq)) {
> --
> 2.43.0
On Fri, Jul 12, 2024 at 7:54 PM Breno Leitao <leitao@debian.org> wrote:
>
> After the commit bdacf3e34945 ("net: Use nested-BH locking for
> napi_alloc_cache.") was merged, the following warning began to appear:
>
>  WARNING: CPU: 5 PID: 1 at net/core/skbuff.c:1451 napi_skb_cache_put+0x82/0x4b0
>
>    __warn+0x12f/0x340
>    napi_skb_cache_put+0x82/0x4b0
>    napi_skb_cache_put+0x82/0x4b0
>    report_bug+0x165/0x370
>    handle_bug+0x3d/0x80
>    exc_invalid_op+0x1a/0x50
>    asm_exc_invalid_op+0x1a/0x20
>    __free_old_xmit+0x1c8/0x510
>    napi_skb_cache_put+0x82/0x4b0
>    __free_old_xmit+0x1c8/0x510
>    __free_old_xmit+0x1c8/0x510
>    __pfx___free_old_xmit+0x10/0x10
>
> The issue arises because virtio is assuming it's running in NAPI context
> even when it's not, such as in the netpoll case.
>
> To resolve this, modify virtnet_poll_tx() to only set NAPI when budget
> is available. Same for virtnet_poll_cleantx(), which always assumed that
> it was in a NAPI context.
>
> Fixes: df133f3f9625 ("virtio_net: bulk free tx skbs")
> Suggested-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---

Acked-by: Jason Wang <jasowang@redhat.com>

Thanks
On 2024/7/12 7:53 PM, Breno Leitao wrote:
> After the commit bdacf3e34945 ("net: Use nested-BH locking for
> napi_alloc_cache.") was merged, the following warning began to appear:
>
>  WARNING: CPU: 5 PID: 1 at net/core/skbuff.c:1451 napi_skb_cache_put+0x82/0x4b0
>
>    __warn+0x12f/0x340
>    napi_skb_cache_put+0x82/0x4b0
>    napi_skb_cache_put+0x82/0x4b0
>    report_bug+0x165/0x370
>    handle_bug+0x3d/0x80
>    exc_invalid_op+0x1a/0x50
>    asm_exc_invalid_op+0x1a/0x20
>    __free_old_xmit+0x1c8/0x510
>    napi_skb_cache_put+0x82/0x4b0
>    __free_old_xmit+0x1c8/0x510
>    __free_old_xmit+0x1c8/0x510
>    __pfx___free_old_xmit+0x10/0x10
>
> The issue arises because virtio is assuming it's running in NAPI context
> even when it's not, such as in the netpoll case.
>
> To resolve this, modify virtnet_poll_tx() to only set NAPI when budget
> is available. Same for virtnet_poll_cleantx(), which always assumed that
> it was in a NAPI context.
>
> Fixes: df133f3f9625 ("virtio_net: bulk free tx skbs")
> Suggested-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
>  drivers/net/virtio_net.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>

>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 0b4747e81464..fb1331827308 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -2341,7 +2341,7 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
>  	return packets;
>  }
>
> -static void virtnet_poll_cleantx(struct receive_queue *rq)
> +static void virtnet_poll_cleantx(struct receive_queue *rq, int budget)
>  {
>  	struct virtnet_info *vi = rq->vq->vdev->priv;
>  	unsigned int index = vq2rxq(rq->vq);
> @@ -2359,7 +2359,7 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
>
>  		do {
>  			virtqueue_disable_cb(sq->vq);
> -			free_old_xmit(sq, txq, true);
> +			free_old_xmit(sq, txq, !!budget);
>  		} while (unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
>
>  		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> @@ -2404,7 +2404,7 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
>  	unsigned int xdp_xmit = 0;
>  	bool napi_complete;
>
> -	virtnet_poll_cleantx(rq);
> +	virtnet_poll_cleantx(rq, budget);
>
>  	received = virtnet_receive(rq, budget, &xdp_xmit);
>  	rq->packets_in_napi += received;
> @@ -2526,7 +2526,7 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget)
>  	txq = netdev_get_tx_queue(vi->dev, index);
>  	__netif_tx_lock(txq, raw_smp_processor_id());
>  	virtqueue_disable_cb(sq->vq);
> -	free_old_xmit(sq, txq, true);
> +	free_old_xmit(sq, txq, !!budget);
>
>  	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
>  		if (netif_tx_queue_stopped(txq)) {
Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri, 12 Jul 2024 04:53:25 -0700 you wrote:
> After the commit bdacf3e34945 ("net: Use nested-BH locking for
> napi_alloc_cache.") was merged, the following warning began to appear:
>
>  WARNING: CPU: 5 PID: 1 at net/core/skbuff.c:1451 napi_skb_cache_put+0x82/0x4b0
>
>    __warn+0x12f/0x340
>    napi_skb_cache_put+0x82/0x4b0
>    napi_skb_cache_put+0x82/0x4b0
>    report_bug+0x165/0x370
>    handle_bug+0x3d/0x80
>    exc_invalid_op+0x1a/0x50
>    asm_exc_invalid_op+0x1a/0x20
>    __free_old_xmit+0x1c8/0x510
>    napi_skb_cache_put+0x82/0x4b0
>    __free_old_xmit+0x1c8/0x510
>    __free_old_xmit+0x1c8/0x510
>    __pfx___free_old_xmit+0x10/0x10
>
> [...]

Here is the summary with links:
  - [net-next] virtio_net: Fix napi_skb_cache_put warning
    https://git.kernel.org/netdev/net-next/c/f8321fa75102

You are awesome, thank you!
Hello Michael,

On Sun, Jul 14, 2024 at 03:38:42AM -0400, Michael S. Tsirkin wrote:
> On Fri, Jul 12, 2024 at 04:53:25AM -0700, Breno Leitao wrote:
> > After the commit bdacf3e34945 ("net: Use nested-BH locking for
> > napi_alloc_cache.") was merged, the following warning began to appear:
> >
> >  WARNING: CPU: 5 PID: 1 at net/core/skbuff.c:1451 napi_skb_cache_put+0x82/0x4b0
> >
> >    __warn+0x12f/0x340
> >    napi_skb_cache_put+0x82/0x4b0
> >    napi_skb_cache_put+0x82/0x4b0
> >    report_bug+0x165/0x370
> >    handle_bug+0x3d/0x80
> >    exc_invalid_op+0x1a/0x50
> >    asm_exc_invalid_op+0x1a/0x20
> >    __free_old_xmit+0x1c8/0x510
> >    napi_skb_cache_put+0x82/0x4b0
> >    __free_old_xmit+0x1c8/0x510
> >    __free_old_xmit+0x1c8/0x510
> >    __pfx___free_old_xmit+0x10/0x10
> >
> > The issue arises because virtio is assuming it's running in NAPI context
> > even when it's not, such as in the netpoll case.
> >
> > To resolve this, modify virtnet_poll_tx() to only set NAPI when budget
> > is available. Same for virtnet_poll_cleantx(), which always assumed that
> > it was in a NAPI context.
> >
> > Fixes: df133f3f9625 ("virtio_net: bulk free tx skbs")
> > Suggested-by: Jakub Kicinski <kuba@kernel.org>
> > Signed-off-by: Breno Leitao <leitao@debian.org>
>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>
> though I'm not sure I understand the connection with bdacf3e34945.

The warning above appeared after bdacf3e34945 landed.
On Mon, Jul 15, 2024 at 04:25:06AM -0700, Breno Leitao wrote:
> Hello Michael,
>
> On Sun, Jul 14, 2024 at 03:38:42AM -0400, Michael S. Tsirkin wrote:
> > On Fri, Jul 12, 2024 at 04:53:25AM -0700, Breno Leitao wrote:
> > > After the commit bdacf3e34945 ("net: Use nested-BH locking for
> > > napi_alloc_cache.") was merged, the following warning began to appear:
> > >
> > >  WARNING: CPU: 5 PID: 1 at net/core/skbuff.c:1451 napi_skb_cache_put+0x82/0x4b0
> > >
> > >    __warn+0x12f/0x340
> > >    napi_skb_cache_put+0x82/0x4b0
> > >    napi_skb_cache_put+0x82/0x4b0
> > >    report_bug+0x165/0x370
> > >    handle_bug+0x3d/0x80
> > >    exc_invalid_op+0x1a/0x50
> > >    asm_exc_invalid_op+0x1a/0x20
> > >    __free_old_xmit+0x1c8/0x510
> > >    napi_skb_cache_put+0x82/0x4b0
> > >    __free_old_xmit+0x1c8/0x510
> > >    __free_old_xmit+0x1c8/0x510
> > >    __pfx___free_old_xmit+0x10/0x10
> > >
> > > The issue arises because virtio is assuming it's running in NAPI context
> > > even when it's not, such as in the netpoll case.
> > >
> > > To resolve this, modify virtnet_poll_tx() to only set NAPI when budget
> > > is available. Same for virtnet_poll_cleantx(), which always assumed that
> > > it was in a NAPI context.
> > >
> > > Fixes: df133f3f9625 ("virtio_net: bulk free tx skbs")
> > > Suggested-by: Jakub Kicinski <kuba@kernel.org>
> > > Signed-off-by: Breno Leitao <leitao@debian.org>
> >
> > Acked-by: Michael S. Tsirkin <mst@redhat.com>
> >
> > though I'm not sure I understand the connection with bdacf3e34945.
>
> The warning above appeared after bdacf3e34945 landed.

Hi Breno,
Thanks for fixing this!

I think the confusion is around the fact that the commit in the Fixes
tag (df133f3f9625) is different from the commit mentioned in the commit
message (bdacf3e34945).

Please help me check if the following is correct:
###
Any tree which includes df133f3f9625 ("virtio_net: bulk free tx skbs")
should also include your patch, since it fixes stuff in there.

The fact that the warning was only made visible in
bdacf3e34945 ("net: Use nested-BH locking for napi_alloc_cache.")
does not change the fact that it was already present before.

Also, having bdacf3e34945 is not necessary for the backport, since
it only made the bug visible.
###

Are above statements right?

It's important to make it clear since this helps the backporting process.

Thanks!
Leo
Hello Leonardo, good to see you here,

On Tue, Sep 03, 2024 at 01:28:50PM -0300, Leonardo Bras wrote:

> Please help me check if the following is correct:
> ###
> Any tree which includes df133f3f9625 ("virtio_net: bulk free tx skbs")
> should also include your patch, since it fixes stuff in there.
>
> The fact that the warning was only made visible in
> bdacf3e34945 ("net: Use nested-BH locking for napi_alloc_cache.")
> does not change the fact that it was already present before.
>
> Also, having bdacf3e34945 is not necessary for the backport, since
> it only made the bug visible.
> ###
>
> Are above statements right?

That is exactly correct. The bug was introduced by df133f3f9625
("virtio_net: bulk free tx skbs"), but it was not visible until
bdacf3e34945 ("net: Use nested-BH locking for napi_alloc_cache.") landed.

You don't need the bdacf3e34945 ("net: Use nested-BH locking for
napi_alloc_cache.") patch backported if you don't want it.

I hope it helps,
--breno
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 0b4747e81464..fb1331827308 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2341,7 +2341,7 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
 	return packets;
 }
 
-static void virtnet_poll_cleantx(struct receive_queue *rq)
+static void virtnet_poll_cleantx(struct receive_queue *rq, int budget)
 {
 	struct virtnet_info *vi = rq->vq->vdev->priv;
 	unsigned int index = vq2rxq(rq->vq);
@@ -2359,7 +2359,7 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
 
 		do {
 			virtqueue_disable_cb(sq->vq);
-			free_old_xmit(sq, txq, true);
+			free_old_xmit(sq, txq, !!budget);
 		} while (unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
 
 		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
@@ -2404,7 +2404,7 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
 	unsigned int xdp_xmit = 0;
 	bool napi_complete;
 
-	virtnet_poll_cleantx(rq);
+	virtnet_poll_cleantx(rq, budget);
 
 	received = virtnet_receive(rq, budget, &xdp_xmit);
 	rq->packets_in_napi += received;
@@ -2526,7 +2526,7 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget)
 	txq = netdev_get_tx_queue(vi->dev, index);
 	__netif_tx_lock(txq, raw_smp_processor_id());
 	virtqueue_disable_cb(sq->vq);
-	free_old_xmit(sq, txq, true);
+	free_old_xmit(sq, txq, !!budget);
 
 	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
 		if (netif_tx_queue_stopped(txq)) {
After the commit bdacf3e34945 ("net: Use nested-BH locking for
napi_alloc_cache.") was merged, the following warning began to appear:

 WARNING: CPU: 5 PID: 1 at net/core/skbuff.c:1451 napi_skb_cache_put+0x82/0x4b0

   __warn+0x12f/0x340
   napi_skb_cache_put+0x82/0x4b0
   napi_skb_cache_put+0x82/0x4b0
   report_bug+0x165/0x370
   handle_bug+0x3d/0x80
   exc_invalid_op+0x1a/0x50
   asm_exc_invalid_op+0x1a/0x20
   __free_old_xmit+0x1c8/0x510
   napi_skb_cache_put+0x82/0x4b0
   __free_old_xmit+0x1c8/0x510
   __free_old_xmit+0x1c8/0x510
   __pfx___free_old_xmit+0x10/0x10

The issue arises because virtio is assuming it's running in NAPI context
even when it's not, such as in the netpoll case.

To resolve this, modify virtnet_poll_tx() to only set NAPI when budget
is available. Same for virtnet_poll_cleantx(), which always assumed that
it was in a NAPI context.

Fixes: df133f3f9625 ("virtio_net: bulk free tx skbs")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
 drivers/net/virtio_net.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
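
For readers who want to see the idea behind `!!budget` in isolation, here is a small user-space sketch. It is not kernel code and not the driver's implementation: `struct free_stats`, `model_free_old_xmit()` and `model_poll_tx()` are hypothetical names invented for this illustration, and the two counters merely stand in for "freed via the NAPI-only cache" versus "freed via the generic path". It assumes the convention the commit message relies on, namely that a poll callback invoked outside NAPI context (for example by netpoll) is given a budget of 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical counters standing in for the two free paths. */
struct free_stats {
	unsigned int napi_cache_frees;	/* models the NAPI-only bulk-free path */
	unsigned int generic_frees;	/* models the context-agnostic path */
};

/*
 * Toy model of free_old_xmit(sq, txq, in_napi): release n buffers,
 * choosing the path from the in_napi flag supplied by the caller.
 */
static void model_free_old_xmit(struct free_stats *st, unsigned int n,
				bool in_napi)
{
	if (in_napi)
		st->napi_cache_frees += n;
	else
		st->generic_frees += n;
}

/*
 * Toy model of a poll callback after the patch: instead of hardcoding
 * true, the flag is derived from budget. !!budget normalizes any
 * non-zero budget to true and a zero budget to false.
 */
static void model_poll_tx(struct free_stats *st, unsigned int pending,
			  int budget)
{
	model_free_old_xmit(st, pending, !!budget);
}

/* Small demonstration that prints which path each call took. */
static void demo(void)
{
	struct free_stats st = { 0, 0 };

	model_poll_tx(&st, 3, 0);	/* assumed netpoll-style call */
	model_poll_tx(&st, 2, 64);	/* regular NAPI poll with a budget */
	printf("napi=%u generic=%u\n", st.napi_cache_frees, st.generic_frees);
}
```

Calling `demo()` from a `main()` exercises both cases. Before the patch the equivalent of `model_poll_tx()` always passed `true`, which is the assumption the warning exposed.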