
[RFC,v2,2/4] virtio_net: disable cb aggressively

Message ID 20210413054733.36363-3-mst@redhat.com (mailing list archive)
State RFC
Delegated to: Netdev Maintainers
Series: virtio net: spurious interrupt related fixes

Checks

Context                          Check    Description
netdev/cover_letter              success
netdev/fixes_present             success
netdev/patch_count               success
netdev/tree_selection            success  Guessed tree name to be net-next
netdev/subject_prefix            success
netdev/cc_maintainers            success  CCed 6 of 6 maintainers
netdev/source_inline             success  Was 0 now: 0
netdev/verify_signedoff          success
netdev/module_param              success  Was 0 now: 0
netdev/build_32bit               success  Errors and warnings before: 0 this patch: 0
netdev/kdoc                      success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes              success
netdev/checkpatch                success  total: 0 errors, 0 warnings, 0 checks, 14 lines checked
netdev/build_allmodconfig_warn   success  Errors and warnings before: 0 this patch: 0
netdev/header_inline             success

Commit Message

Michael S. Tsirkin April 13, 2021, 5:47 a.m. UTC
There are currently two cases where we poll the TX vq not in response to a
callback: start xmit and rx napi.  We currently do this with callbacks
enabled, which can cause extra interrupts from the card.  This used not to
be a big issue as we ran with interrupts disabled, but that is no longer
the case, and in some cases the rate of spurious interrupts is so high
that Linux detects it and actually kills the interrupt.

Fix this up by disabling the callbacks before polling the TX vq.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/net/virtio_net.c | 2 ++
 1 file changed, 2 insertions(+)
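
For readers less familiar with the virtqueue API, the calls involved behave
roughly as follows.  The snippet below is a minimal annotated sketch of the
disable/drain/re-arm pattern, with an illustrative consume_tx_buf() helper;
it is not an excerpt of the driver.

    #include <linux/virtio.h>

    /* Illustrative placeholder for per-buffer cleanup (e.g. freeing the skb). */
    static void consume_tx_buf(void *buf, unsigned int len)
    {
    }

    static void reclaim_tx(struct virtqueue *vq)
    {
            unsigned int len;
            void *buf;

            /* Hint to the device that we do not want used-buffer interrupts.
             * It is only a hint: a callback may still fire, but the interrupt
             * rate while we poll drops sharply.
             */
            virtqueue_disable_cb(vq);

            /* Drain completed buffers directly rather than waiting for a
             * callback.
             */
            while ((buf = virtqueue_get_buf(vq, &len)) != NULL)
                    consume_tx_buf(buf, len);

            /* Re-arm interrupts only once we stop polling.  The delayed
             * variant asks the device to hold off the next interrupt until a
             * batch of buffers has been used, which is what start_xmit() does
             * on the kick path.
             */
            virtqueue_enable_cb_delayed(vq);
    }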

Comments

Jason Wang April 13, 2021, 8:53 a.m. UTC | #1
On 2021/4/13 1:47 PM, Michael S. Tsirkin wrote:
> There are currently two cases where we poll TX vq not in response to a
> callback: start xmit and rx napi.  We currently do this with callbacks
> enabled which can cause extra interrupts from the card.  Used not to be
> a big issue as we run with interrupts disabled but that is no longer the
> case, and in some cases the rate of spurious interrupts is so high
> linux detects this and actually kills the interrupt.
>
> Fix up by disabling the callbacks before polling the tx vq.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>   drivers/net/virtio_net.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 82e520d2cb12..16d5abed582c 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1429,6 +1429,7 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
>   		return;
>   
>   	if (__netif_tx_trylock(txq)) {
> +		virtqueue_disable_cb(sq->vq);
>   		free_old_xmit_skbs(sq, true);
>   		__netif_tx_unlock(txq);


Any reason that we don't need to enable the cb here?

And as we discussed in the past, it's probably the time to have a single 
NAPI for both tx and rx?

Thanks


>   	}
> @@ -1582,6 +1583,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>   	bool use_napi = sq->napi.weight;
>   
>   	/* Free up any pending old buffers before queueing new ones. */
> +	virtqueue_disable_cb(sq->vq);
>   	free_old_xmit_skbs(sq, false);
>   
>   	if (use_napi && kick)
Willem de Bruijn April 13, 2021, 2:08 p.m. UTC | #2
On Tue, Apr 13, 2021 at 4:53 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2021/4/13 1:47 PM, Michael S. Tsirkin wrote:
> > [...]
> > @@ -1429,6 +1429,7 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
> >               return;
> >
> >       if (__netif_tx_trylock(txq)) {
> > +             virtqueue_disable_cb(sq->vq);
> >               free_old_xmit_skbs(sq, true);
> >               __netif_tx_unlock(txq);
>
>
> Any reason that we don't need to enable the cb here?

This is an opportunistic clean outside the normal tx-napi path, so if the
tx interrupt is disabled here, it won't be re-enabled based on
napi_complete_done.

I think that means that it stays disabled until the following start_xmit:

        if (use_napi && kick)
                virtqueue_enable_cb_delayed(sq->vq);

But that seems sufficient.
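
To spell out the sequence described above (a sketch, assuming tx napi is in
use; not code from the driver):

    /*
     * rx napi poll
     *   -> virtnet_poll_cleantx()
     *        virtqueue_disable_cb(sq->vq);   tx callbacks now off
     *        free_old_xmit_skbs(sq, true);
     *        ...                             still off on return: this is not
     *                                        the tx napi handler, so nothing
     *                                        here reaches napi_complete_done()
     *
     * next start_xmit()
     *   virtqueue_disable_cb(sq->vq);        still off (now explicitly)
     *   free_old_xmit_skbs(sq, false);
     *   ...
     *   if (use_napi && kick)
     *           virtqueue_enable_cb_delayed(sq->vq);   re-armed only here
     */
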
Michael S. Tsirkin April 13, 2021, 2:33 p.m. UTC | #3
On Tue, Apr 13, 2021 at 04:53:32PM +0800, Jason Wang wrote:
> 
> On 2021/4/13 1:47 PM, Michael S. Tsirkin wrote:
> > [...]
> > @@ -1429,6 +1429,7 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
> >   		return;
> >   	if (__netif_tx_trylock(txq)) {
> > +		virtqueue_disable_cb(sq->vq);
> >   		free_old_xmit_skbs(sq, true);
> >   		__netif_tx_unlock(txq);
> 
> 
> Any reason that we don't need to enable the cb here?

Good point ... probably only if the vq isn't empty ...
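
Purely to illustrate that option (not part of the posted patch; the emptiness
check below is an assumption about how it might be written, and real code
would also recheck the virtqueue_enable_cb() return value to close the race
with buffers completing in the meantime), the rx-napi path could re-arm
conditionally along these lines:

    if (__netif_tx_trylock(txq)) {
            virtqueue_disable_cb(sq->vq);
            free_old_xmit_skbs(sq, true);
            /* Re-arm only if the device still owns tx buffers, i.e. a later
             * completion interrupt is actually needed.
             */
            if (sq->vq->num_free < virtqueue_get_vring_size(sq->vq))
                    virtqueue_enable_cb(sq->vq);
            __netif_tx_unlock(txq);
    }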

> And as we discussed in the past, it's probably the time to have a single
> NAPI for both tx and rx?
> 
> Thanks


Dunno. I'd like to get a minimal bugfix in, refactoring on top ...

> 
> >   	}
> > @@ -1582,6 +1583,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> >   	bool use_napi = sq->napi.weight;
> >   	/* Free up any pending old buffers before queueing new ones. */
> > +	virtqueue_disable_cb(sq->vq);
> >   	free_old_xmit_skbs(sq, false);
> >   	if (use_napi && kick)

Patch

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 82e520d2cb12..16d5abed582c 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1429,6 +1429,7 @@  static void virtnet_poll_cleantx(struct receive_queue *rq)
 		return;
 
 	if (__netif_tx_trylock(txq)) {
+		virtqueue_disable_cb(sq->vq);
 		free_old_xmit_skbs(sq, true);
 		__netif_tx_unlock(txq);
 	}
@@ -1582,6 +1583,7 @@  static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	bool use_napi = sq->napi.weight;
 
 	/* Free up any pending old buffers before queueing new ones. */
+	virtqueue_disable_cb(sq->vq);
 	free_old_xmit_skbs(sq, false);
 
 	if (use_napi && kick)