diff mbox series

[net-next] virtio_net: Add TX stop and wake counters

Message ID 20240130142521.18593-1-danielj@nvidia.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net-next] virtio_net: Add TX stop and wake counters | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1048 this patch: 1048
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 0 of 0 maintainers
netdev/build_clang success Errors and warnings before: 1065 this patch: 1065
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1065 this patch: 1065
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 64 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-02-01--18-00 (tests: 717)

Commit Message

Dan Jurgens Jan. 30, 2024, 2:25 p.m. UTC
Add tx queue stop and wake counters; they are useful for debugging.

	$ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake'
	...
	tx_queue_1_tx_stop: 16726
	tx_queue_1_tx_wake: 16726
	...
	tx_queue_8_tx_stop: 1500110
	tx_queue_8_tx_wake: 1500110

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
---
 drivers/net/virtio_net.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

Comments

Michael S. Tsirkin Jan. 30, 2024, 2:58 p.m. UTC | #1
On Tue, Jan 30, 2024 at 08:25:21AM -0600, Daniel Jurgens wrote:
> Add a tx queue stop and wake counters, they are useful for debugging.
> 
> 	$ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake'
> 	...
> 	tx_queue_1_tx_stop: 16726
> 	tx_queue_1_tx_wake: 16726
> 	...
> 	tx_queue_8_tx_stop: 1500110
> 	tx_queue_8_tx_wake: 1500110
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>

Hmm isn't one always same as the other, except when queue is stopped?
And when it is stopped you can see that in the status?
So how is having two useful?


> ---
>  drivers/net/virtio_net.c | 26 ++++++++++++++++++++++++--
>  1 file changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 3cb8aa193884..7e3c31ceaf7e 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -88,6 +88,8 @@ struct virtnet_sq_stats {
>  	u64_stats_t xdp_tx_drops;
>  	u64_stats_t kicks;
>  	u64_stats_t tx_timeouts;
> +	u64_stats_t tx_stop;
> +	u64_stats_t tx_wake;
>  };
>  
>  struct virtnet_rq_stats {
> @@ -112,6 +114,8 @@ static const struct virtnet_stat_desc virtnet_sq_stats_desc[] = {
>  	{ "xdp_tx_drops",	VIRTNET_SQ_STAT(xdp_tx_drops) },
>  	{ "kicks",		VIRTNET_SQ_STAT(kicks) },
>  	{ "tx_timeouts",	VIRTNET_SQ_STAT(tx_timeouts) },
> +	{ "tx_stop",		VIRTNET_SQ_STAT(tx_stop) },
> +	{ "tx_wake",		VIRTNET_SQ_STAT(tx_wake) },
>  };
>  
>  static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = {
> @@ -843,6 +847,9 @@ static void check_sq_full_and_disable(struct virtnet_info *vi,
>  	 */
>  	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
>  		netif_stop_subqueue(dev, qnum);
> +		u64_stats_update_begin(&sq->stats.syncp);
> +		u64_stats_inc(&sq->stats.tx_stop);
> +		u64_stats_update_end(&sq->stats.syncp);
>  		if (use_napi) {
>  			if (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
>  				virtqueue_napi_schedule(&sq->napi, sq->vq);
> @@ -851,6 +858,9 @@ static void check_sq_full_and_disable(struct virtnet_info *vi,
>  			free_old_xmit_skbs(sq, false);
>  			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
>  				netif_start_subqueue(dev, qnum);
> +				u64_stats_update_begin(&sq->stats.syncp);
> +				u64_stats_inc(&sq->stats.tx_wake);
> +				u64_stats_update_end(&sq->stats.syncp);
>  				virtqueue_disable_cb(sq->vq);
>  			}
>  		}
> @@ -2163,8 +2173,14 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
>  			free_old_xmit_skbs(sq, true);
>  		} while (unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
>  
> -		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> +		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> +			if (netif_tx_queue_stopped(txq)) {
> +				u64_stats_update_begin(&sq->stats.syncp);
> +				u64_stats_inc(&sq->stats.tx_wake);
> +				u64_stats_update_end(&sq->stats.syncp);
> +			}
>  			netif_tx_wake_queue(txq);
> +		}
>  
>  		__netif_tx_unlock(txq);
>  	}
> @@ -2310,8 +2326,14 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget)
>  	virtqueue_disable_cb(sq->vq);
>  	free_old_xmit_skbs(sq, true);
>  
> -	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> +	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> +		if (netif_tx_queue_stopped(txq)) {
> +			u64_stats_update_begin(&sq->stats.syncp);
> +			u64_stats_inc(&sq->stats.tx_wake);
> +			u64_stats_update_end(&sq->stats.syncp);
> +		}
>  		netif_tx_wake_queue(txq);
> +	}
>  
>  	opaque = virtqueue_enable_cb_prepare(sq->vq);
>  
> -- 
> 2.42.0
Heng Qi Jan. 30, 2024, 3:16 p.m. UTC | #2
在 2024/1/30 下午10:25, Daniel Jurgens 写道:
> Add a tx queue stop and wake counters, they are useful for debugging.
>
> 	$ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake'
> 	...
> 	tx_queue_1_tx_stop: 16726
> 	tx_queue_1_tx_wake: 16726
> 	...
> 	tx_queue_8_tx_stop: 1500110
> 	tx_queue_8_tx_wake: 1500110
>
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> ---
>   drivers/net/virtio_net.c | 26 ++++++++++++++++++++++++--
>   1 file changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 3cb8aa193884..7e3c31ceaf7e 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -88,6 +88,8 @@ struct virtnet_sq_stats {
>   	u64_stats_t xdp_tx_drops;
>   	u64_stats_t kicks;
>   	u64_stats_t tx_timeouts;
> +	u64_stats_t tx_stop;
> +	u64_stats_t tx_wake;
>   };

Hi Daniel!

tx_stop/wake only count events in the I/O path.
Does the queue state handling in virtnet_config_changed_work and
virtnet_tx_resize need to be counted as well?

Thanks,
Heng

>   
>   struct virtnet_rq_stats {
> @@ -112,6 +114,8 @@ static const struct virtnet_stat_desc virtnet_sq_stats_desc[] = {
>   	{ "xdp_tx_drops",	VIRTNET_SQ_STAT(xdp_tx_drops) },
>   	{ "kicks",		VIRTNET_SQ_STAT(kicks) },
>   	{ "tx_timeouts",	VIRTNET_SQ_STAT(tx_timeouts) },
> +	{ "tx_stop",		VIRTNET_SQ_STAT(tx_stop) },
> +	{ "tx_wake",		VIRTNET_SQ_STAT(tx_wake) },
>   };
>   
>   static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = {
> @@ -843,6 +847,9 @@ static void check_sq_full_and_disable(struct virtnet_info *vi,
>   	 */
>   	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
>   		netif_stop_subqueue(dev, qnum);
> +		u64_stats_update_begin(&sq->stats.syncp);
> +		u64_stats_inc(&sq->stats.tx_stop);
> +		u64_stats_update_end(&sq->stats.syncp);
>   		if (use_napi) {
>   			if (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
>   				virtqueue_napi_schedule(&sq->napi, sq->vq);
> @@ -851,6 +858,9 @@ static void check_sq_full_and_disable(struct virtnet_info *vi,
>   			free_old_xmit_skbs(sq, false);
>   			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
>   				netif_start_subqueue(dev, qnum);
> +				u64_stats_update_begin(&sq->stats.syncp);
> +				u64_stats_inc(&sq->stats.tx_wake);
> +				u64_stats_update_end(&sq->stats.syncp);
>   				virtqueue_disable_cb(sq->vq);
>   			}
>   		}
> @@ -2163,8 +2173,14 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
>   			free_old_xmit_skbs(sq, true);
>   		} while (unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
>   
> -		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> +		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> +			if (netif_tx_queue_stopped(txq)) {
> +				u64_stats_update_begin(&sq->stats.syncp);
> +				u64_stats_inc(&sq->stats.tx_wake);
> +				u64_stats_update_end(&sq->stats.syncp);
> +			}
>   			netif_tx_wake_queue(txq);
> +		}
>   
>   		__netif_tx_unlock(txq);
>   	}
> @@ -2310,8 +2326,14 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget)
>   	virtqueue_disable_cb(sq->vq);
>   	free_old_xmit_skbs(sq, true);
>   
> -	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> +	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> +		if (netif_tx_queue_stopped(txq)) {
> +			u64_stats_update_begin(&sq->stats.syncp);
> +			u64_stats_inc(&sq->stats.tx_wake);
> +			u64_stats_update_end(&sq->stats.syncp);
> +		}
>   		netif_tx_wake_queue(txq);
> +	}
>   
>   	opaque = virtqueue_enable_cb_prepare(sq->vq);
>
Dan Jurgens Jan. 30, 2024, 3:40 p.m. UTC | #3
> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, January 30, 2024 8:58 AM
> 
> On Tue, Jan 30, 2024 at 08:25:21AM -0600, Daniel Jurgens wrote:
> > Add a tx queue stop and wake counters, they are useful for debugging.
> >
> > 	$ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake'
> > 	...
> > 	tx_queue_1_tx_stop: 16726
> > 	tx_queue_1_tx_wake: 16726
> > 	...
> > 	tx_queue_8_tx_stop: 1500110
> > 	tx_queue_8_tx_wake: 1500110
> >
> > Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> > Reviewed-by: Parav Pandit <parav@nvidia.com>
> 
> Hmm isn't one always same as the other, except when queue is stopped?
> And when it is stopped you can see that in the status?
> So how is having two useful?

At idle the counters will be the same, unless a tx_timeout occurs. But under load they can be monitored to see which queues are stopped and get an idea of how long they are stopped.

Other net drivers (not all), also have the wake counter. In my opinion it makes the stop counter more useful, at little cost.

> 
> 
> > ---
> >  drivers/net/virtio_net.c | 26 ++++++++++++++++++++++++--
> >  1 file changed, 24 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index
> > 3cb8aa193884..7e3c31ceaf7e 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -88,6 +88,8 @@ struct virtnet_sq_stats {
> >  	u64_stats_t xdp_tx_drops;
> >  	u64_stats_t kicks;
> >  	u64_stats_t tx_timeouts;
> > +	u64_stats_t tx_stop;
> > +	u64_stats_t tx_wake;
> >  };
> >
> >  struct virtnet_rq_stats {
> > @@ -112,6 +114,8 @@ static const struct virtnet_stat_desc
> virtnet_sq_stats_desc[] = {
> >  	{ "xdp_tx_drops",	VIRTNET_SQ_STAT(xdp_tx_drops) },
> >  	{ "kicks",		VIRTNET_SQ_STAT(kicks) },
> >  	{ "tx_timeouts",	VIRTNET_SQ_STAT(tx_timeouts) },
> > +	{ "tx_stop",		VIRTNET_SQ_STAT(tx_stop) },
> > +	{ "tx_wake",		VIRTNET_SQ_STAT(tx_wake) },
> >  };
> >
> >  static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = { @@
> > -843,6 +847,9 @@ static void check_sq_full_and_disable(struct virtnet_info
> *vi,
> >  	 */
> >  	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> >  		netif_stop_subqueue(dev, qnum);
> > +		u64_stats_update_begin(&sq->stats.syncp);
> > +		u64_stats_inc(&sq->stats.tx_stop);
> > +		u64_stats_update_end(&sq->stats.syncp);
> >  		if (use_napi) {
> >  			if (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
> >  				virtqueue_napi_schedule(&sq->napi, sq-
> >vq); @@ -851,6 +858,9 @@
> > static void check_sq_full_and_disable(struct virtnet_info *vi,
> >  			free_old_xmit_skbs(sq, false);
> >  			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
> >  				netif_start_subqueue(dev, qnum);
> > +				u64_stats_update_begin(&sq->stats.syncp);
> > +				u64_stats_inc(&sq->stats.tx_wake);
> > +				u64_stats_update_end(&sq->stats.syncp);
> >  				virtqueue_disable_cb(sq->vq);
> >  			}
> >  		}
> > @@ -2163,8 +2173,14 @@ static void virtnet_poll_cleantx(struct
> receive_queue *rq)
> >  			free_old_xmit_skbs(sq, true);
> >  		} while (unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> >
> > -		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > +		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > +			if (netif_tx_queue_stopped(txq)) {
> > +				u64_stats_update_begin(&sq->stats.syncp);
> > +				u64_stats_inc(&sq->stats.tx_wake);
> > +				u64_stats_update_end(&sq->stats.syncp);
> > +			}
> >  			netif_tx_wake_queue(txq);
> > +		}
> >
> >  		__netif_tx_unlock(txq);
> >  	}
> > @@ -2310,8 +2326,14 @@ static int virtnet_poll_tx(struct napi_struct
> *napi, int budget)
> >  	virtqueue_disable_cb(sq->vq);
> >  	free_old_xmit_skbs(sq, true);
> >
> > -	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > +	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > +		if (netif_tx_queue_stopped(txq)) {
> > +			u64_stats_update_begin(&sq->stats.syncp);
> > +			u64_stats_inc(&sq->stats.tx_wake);
> > +			u64_stats_update_end(&sq->stats.syncp);
> > +		}
> >  		netif_tx_wake_queue(txq);
> > +	}
> >
> >  	opaque = virtqueue_enable_cb_prepare(sq->vq);
> >
> > --
> > 2.42.0
Michael S. Tsirkin Jan. 30, 2024, 3:41 p.m. UTC | #4
On Tue, Jan 30, 2024 at 03:40:21PM +0000, Daniel Jurgens wrote:
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, January 30, 2024 8:58 AM
> > 
> > On Tue, Jan 30, 2024 at 08:25:21AM -0600, Daniel Jurgens wrote:
> > > Add a tx queue stop and wake counters, they are useful for debugging.
> > >
> > > 	$ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake'
> > > 	...
> > > 	tx_queue_1_tx_stop: 16726
> > > 	tx_queue_1_tx_wake: 16726
> > > 	...
> > > 	tx_queue_8_tx_stop: 1500110
> > > 	tx_queue_8_tx_wake: 1500110
> > >
> > > Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > 
> > Hmm isn't one always same as the other, except when queue is stopped?
> > And when it is stopped you can see that in the status?
> > So how is having two useful?
> 
> At idle the counters will be the same, unless a tx_timeout occurs. But under load they can be monitored to see which queues are stopped and get an idea of how long they are stopped.

how does it give you the idea of how long they are stopped?

> Other net drivers (not all), also have the wake counter.

Examples?

> In my opinion it makes the stop counter more useful, at little cost.
> 
> > 
> > 
> > > ---
> > >  drivers/net/virtio_net.c | 26 ++++++++++++++++++++++++--
> > >  1 file changed, 24 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index
> > > 3cb8aa193884..7e3c31ceaf7e 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -88,6 +88,8 @@ struct virtnet_sq_stats {
> > >  	u64_stats_t xdp_tx_drops;
> > >  	u64_stats_t kicks;
> > >  	u64_stats_t tx_timeouts;
> > > +	u64_stats_t tx_stop;
> > > +	u64_stats_t tx_wake;
> > >  };
> > >
> > >  struct virtnet_rq_stats {
> > > @@ -112,6 +114,8 @@ static const struct virtnet_stat_desc
> > virtnet_sq_stats_desc[] = {
> > >  	{ "xdp_tx_drops",	VIRTNET_SQ_STAT(xdp_tx_drops) },
> > >  	{ "kicks",		VIRTNET_SQ_STAT(kicks) },
> > >  	{ "tx_timeouts",	VIRTNET_SQ_STAT(tx_timeouts) },
> > > +	{ "tx_stop",		VIRTNET_SQ_STAT(tx_stop) },
> > > +	{ "tx_wake",		VIRTNET_SQ_STAT(tx_wake) },
> > >  };
> > >
> > >  static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = { @@
> > > -843,6 +847,9 @@ static void check_sq_full_and_disable(struct virtnet_info
> > *vi,
> > >  	 */
> > >  	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> > >  		netif_stop_subqueue(dev, qnum);
> > > +		u64_stats_update_begin(&sq->stats.syncp);
> > > +		u64_stats_inc(&sq->stats.tx_stop);
> > > +		u64_stats_update_end(&sq->stats.syncp);
> > >  		if (use_napi) {
> > >  			if (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
> > >  				virtqueue_napi_schedule(&sq->napi, sq-
> > >vq); @@ -851,6 +858,9 @@
> > > static void check_sq_full_and_disable(struct virtnet_info *vi,
> > >  			free_old_xmit_skbs(sq, false);
> > >  			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
> > >  				netif_start_subqueue(dev, qnum);
> > > +				u64_stats_update_begin(&sq->stats.syncp);
> > > +				u64_stats_inc(&sq->stats.tx_wake);
> > > +				u64_stats_update_end(&sq->stats.syncp);
> > >  				virtqueue_disable_cb(sq->vq);
> > >  			}
> > >  		}
> > > @@ -2163,8 +2173,14 @@ static void virtnet_poll_cleantx(struct
> > receive_queue *rq)
> > >  			free_old_xmit_skbs(sq, true);
> > >  		} while (unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> > >
> > > -		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > > +		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > > +			if (netif_tx_queue_stopped(txq)) {
> > > +				u64_stats_update_begin(&sq->stats.syncp);
> > > +				u64_stats_inc(&sq->stats.tx_wake);
> > > +				u64_stats_update_end(&sq->stats.syncp);
> > > +			}
> > >  			netif_tx_wake_queue(txq);
> > > +		}
> > >
> > >  		__netif_tx_unlock(txq);
> > >  	}
> > > @@ -2310,8 +2326,14 @@ static int virtnet_poll_tx(struct napi_struct
> > *napi, int budget)
> > >  	virtqueue_disable_cb(sq->vq);
> > >  	free_old_xmit_skbs(sq, true);
> > >
> > > -	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > > +	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > > +		if (netif_tx_queue_stopped(txq)) {
> > > +			u64_stats_update_begin(&sq->stats.syncp);
> > > +			u64_stats_inc(&sq->stats.tx_wake);
> > > +			u64_stats_update_end(&sq->stats.syncp);
> > > +		}
> > >  		netif_tx_wake_queue(txq);
> > > +	}
> > >
> > >  	opaque = virtqueue_enable_cb_prepare(sq->vq);
> > >
> > > --
> > > 2.42.0
Dan Jurgens Jan. 30, 2024, 3:43 p.m. UTC | #5
> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Tuesday, January 30, 2024 9:17 AM
> 在 2024/1/30 下午10:25, Daniel Jurgens 写道:
> > Add a tx queue stop and wake counters, they are useful for debugging.
> >
> > 	$ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake'
> > 	...
> > 	tx_queue_1_tx_stop: 16726
> > 	tx_queue_1_tx_wake: 16726
> > 	...
> > 	tx_queue_8_tx_stop: 1500110
> > 	tx_queue_8_tx_wake: 1500110
> >
> > Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > ---
> >   drivers/net/virtio_net.c | 26 ++++++++++++++++++++++++--
> >   1 file changed, 24 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index
> > 3cb8aa193884..7e3c31ceaf7e 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -88,6 +88,8 @@ struct virtnet_sq_stats {
> >   	u64_stats_t xdp_tx_drops;
> >   	u64_stats_t kicks;
> >   	u64_stats_t tx_timeouts;
> > +	u64_stats_t tx_stop;
> > +	u64_stats_t tx_wake;
> >   };
> 
> Hi Daniel!
> 
> tx_stop/wake only counts the status in the I/O path.
> Do the status of virtnet_config_changed_work and virtnet_tx_resize need to
> be counted?
> 

My motivation for the counter is detecting full TX queues. I don't think counting them in the control path is useful, but it can be done if you disagree.

> Thanks,
> Heng
> 
> >
> >   struct virtnet_rq_stats {
> > @@ -112,6 +114,8 @@ static const struct virtnet_stat_desc
> virtnet_sq_stats_desc[] = {
> >   	{ "xdp_tx_drops",	VIRTNET_SQ_STAT(xdp_tx_drops) },
> >   	{ "kicks",		VIRTNET_SQ_STAT(kicks) },
> >   	{ "tx_timeouts",	VIRTNET_SQ_STAT(tx_timeouts) },
> > +	{ "tx_stop",		VIRTNET_SQ_STAT(tx_stop) },
> > +	{ "tx_wake",		VIRTNET_SQ_STAT(tx_wake) },
> >   };
> >
> >   static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = { @@
> > -843,6 +847,9 @@ static void check_sq_full_and_disable(struct virtnet_info
> *vi,
> >   	 */
> >   	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> >   		netif_stop_subqueue(dev, qnum);
> > +		u64_stats_update_begin(&sq->stats.syncp);
> > +		u64_stats_inc(&sq->stats.tx_stop);
> > +		u64_stats_update_end(&sq->stats.syncp);
> >   		if (use_napi) {
> >   			if (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
> >   				virtqueue_napi_schedule(&sq->napi, sq-
> >vq); @@ -851,6 +858,9 @@
> > static void check_sq_full_and_disable(struct virtnet_info *vi,
> >   			free_old_xmit_skbs(sq, false);
> >   			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
> >   				netif_start_subqueue(dev, qnum);
> > +				u64_stats_update_begin(&sq->stats.syncp);
> > +				u64_stats_inc(&sq->stats.tx_wake);
> > +				u64_stats_update_end(&sq->stats.syncp);
> >   				virtqueue_disable_cb(sq->vq);
> >   			}
> >   		}
> > @@ -2163,8 +2173,14 @@ static void virtnet_poll_cleantx(struct
> receive_queue *rq)
> >   			free_old_xmit_skbs(sq, true);
> >   		} while (unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> >
> > -		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > +		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > +			if (netif_tx_queue_stopped(txq)) {
> > +				u64_stats_update_begin(&sq->stats.syncp);
> > +				u64_stats_inc(&sq->stats.tx_wake);
> > +				u64_stats_update_end(&sq->stats.syncp);
> > +			}
> >   			netif_tx_wake_queue(txq);
> > +		}
> >
> >   		__netif_tx_unlock(txq);
> >   	}
> > @@ -2310,8 +2326,14 @@ static int virtnet_poll_tx(struct napi_struct
> *napi, int budget)
> >   	virtqueue_disable_cb(sq->vq);
> >   	free_old_xmit_skbs(sq, true);
> >
> > -	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > +	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > +		if (netif_tx_queue_stopped(txq)) {
> > +			u64_stats_update_begin(&sq->stats.syncp);
> > +			u64_stats_inc(&sq->stats.tx_wake);
> > +			u64_stats_update_end(&sq->stats.syncp);
> > +		}
> >   		netif_tx_wake_queue(txq);
> > +	}
> >
> >   	opaque = virtqueue_enable_cb_prepare(sq->vq);
> >
Dan Jurgens Jan. 30, 2024, 3:50 p.m. UTC | #6
> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, January 30, 2024 9:42 AM
> On Tue, Jan 30, 2024 at 03:40:21PM +0000, Daniel Jurgens wrote:
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Tuesday, January 30, 2024 8:58 AM
> > >
> > > On Tue, Jan 30, 2024 at 08:25:21AM -0600, Daniel Jurgens wrote:
> > > > Add a tx queue stop and wake counters, they are useful for debugging.
> > > >
> > > > 	$ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake'
> > > > 	...
> > > > 	tx_queue_1_tx_stop: 16726
> > > > 	tx_queue_1_tx_wake: 16726
> > > > 	...
> > > > 	tx_queue_8_tx_stop: 1500110
> > > > 	tx_queue_8_tx_wake: 1500110
> > > >
> > > > Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> > > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > >
> > > Hmm isn't one always same as the other, except when queue is stopped?
> > > And when it is stopped you can see that in the status?
> > > So how is having two useful?
> >
> > At idle the counters will be the same, unless a tx_timeout occurs. But
> under load they can be monitored to see which queues are stopped and get
> an idea of how long they are stopped.
> 
> how does it give you the idea of how long they are stopped?

By serially monitoring the counter you can see stops that persist for long intervals that are still shorter than the tx_timeout time.

> 
> > Other net drivers (not all), also have the wake counter.
> 
> Examples?

[danielj@sw-mtx-051 upstream]$ ethtool -i ens2f1np1
driver: mlx5_core                                  
version: 6.7.0+                                    
...
[danielj@sw-mtx-051 upstream]$ ethtool -S ens2f1np1 | grep wake
     tx_queue_wake: 0
     tx0_wake: 0

> 
> > In my opinion it makes the stop counter more useful, at little cost.
> >
> > >
> > >
> > > > ---
> > > >  drivers/net/virtio_net.c | 26 ++++++++++++++++++++++++--
> > > >  1 file changed, 24 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > index 3cb8aa193884..7e3c31ceaf7e 100644
> > > > --- a/drivers/net/virtio_net.c
> > > > +++ b/drivers/net/virtio_net.c
> > > > @@ -88,6 +88,8 @@ struct virtnet_sq_stats {
> > > >  	u64_stats_t xdp_tx_drops;
> > > >  	u64_stats_t kicks;
> > > >  	u64_stats_t tx_timeouts;
> > > > +	u64_stats_t tx_stop;
> > > > +	u64_stats_t tx_wake;
> > > >  };
> > > >
> > > >  struct virtnet_rq_stats {
> > > > @@ -112,6 +114,8 @@ static const struct virtnet_stat_desc
> > > virtnet_sq_stats_desc[] = {
> > > >  	{ "xdp_tx_drops",	VIRTNET_SQ_STAT(xdp_tx_drops) },
> > > >  	{ "kicks",		VIRTNET_SQ_STAT(kicks) },
> > > >  	{ "tx_timeouts",	VIRTNET_SQ_STAT(tx_timeouts) },
> > > > +	{ "tx_stop",		VIRTNET_SQ_STAT(tx_stop) },
> > > > +	{ "tx_wake",		VIRTNET_SQ_STAT(tx_wake) },
> > > >  };
> > > >
> > > >  static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = {
> > > > @@
> > > > -843,6 +847,9 @@ static void check_sq_full_and_disable(struct
> > > > virtnet_info
> > > *vi,
> > > >  	 */
> > > >  	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> > > >  		netif_stop_subqueue(dev, qnum);
> > > > +		u64_stats_update_begin(&sq->stats.syncp);
> > > > +		u64_stats_inc(&sq->stats.tx_stop);
> > > > +		u64_stats_update_end(&sq->stats.syncp);
> > > >  		if (use_napi) {
> > > >  			if (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
> > > >  				virtqueue_napi_schedule(&sq->napi, sq- vq);
> @@ -851,6 +858,9
> > > >@@  static void check_sq_full_and_disable(struct virtnet_info *vi,
> > > >  			free_old_xmit_skbs(sq, false);
> > > >  			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
> > > >  				netif_start_subqueue(dev, qnum);
> > > > +				u64_stats_update_begin(&sq->stats.syncp);
> > > > +				u64_stats_inc(&sq->stats.tx_wake);
> > > > +				u64_stats_update_end(&sq->stats.syncp);
> > > >  				virtqueue_disable_cb(sq->vq);
> > > >  			}
> > > >  		}
> > > > @@ -2163,8 +2173,14 @@ static void virtnet_poll_cleantx(struct
> > > receive_queue *rq)
> > > >  			free_old_xmit_skbs(sq, true);
> > > >  		} while (unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> > > >
> > > > -		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > > > +		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > > > +			if (netif_tx_queue_stopped(txq)) {
> > > > +				u64_stats_update_begin(&sq->stats.syncp);
> > > > +				u64_stats_inc(&sq->stats.tx_wake);
> > > > +				u64_stats_update_end(&sq->stats.syncp);
> > > > +			}
> > > >  			netif_tx_wake_queue(txq);
> > > > +		}
> > > >
> > > >  		__netif_tx_unlock(txq);
> > > >  	}
> > > > @@ -2310,8 +2326,14 @@ static int virtnet_poll_tx(struct
> > > > napi_struct
> > > *napi, int budget)
> > > >  	virtqueue_disable_cb(sq->vq);
> > > >  	free_old_xmit_skbs(sq, true);
> > > >
> > > > -	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > > > +	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > > > +		if (netif_tx_queue_stopped(txq)) {
> > > > +			u64_stats_update_begin(&sq->stats.syncp);
> > > > +			u64_stats_inc(&sq->stats.tx_wake);
> > > > +			u64_stats_update_end(&sq->stats.syncp);
> > > > +		}
> > > >  		netif_tx_wake_queue(txq);
> > > > +	}
> > > >
> > > >  	opaque = virtqueue_enable_cb_prepare(sq->vq);
> > > >
> > > > --
> > > > 2.42.0
Michael S. Tsirkin Jan. 30, 2024, 3:52 p.m. UTC | #7
On Tue, Jan 30, 2024 at 03:43:48PM +0000, Daniel Jurgens wrote:
> > From: Heng Qi <hengqi@linux.alibaba.com>
> > Sent: Tuesday, January 30, 2024 9:17 AM
> > 在 2024/1/30 下午10:25, Daniel Jurgens 写道:
> > > Add a tx queue stop and wake counters, they are useful for debugging.
> > >
> > > 	$ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake'
> > > 	...
> > > 	tx_queue_1_tx_stop: 16726
> > > 	tx_queue_1_tx_wake: 16726
> > > 	...
> > > 	tx_queue_8_tx_stop: 1500110
> > > 	tx_queue_8_tx_wake: 1500110
> > >
> > > Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > > ---
> > >   drivers/net/virtio_net.c | 26 ++++++++++++++++++++++++--
> > >   1 file changed, 24 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index
> > > 3cb8aa193884..7e3c31ceaf7e 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -88,6 +88,8 @@ struct virtnet_sq_stats {
> > >   	u64_stats_t xdp_tx_drops;
> > >   	u64_stats_t kicks;
> > >   	u64_stats_t tx_timeouts;
> > > +	u64_stats_t tx_stop;
> > > +	u64_stats_t tx_wake;
> > >   };
> > 
> > Hi Daniel!
> > 
> > tx_stop/wake only counts the status in the I/O path.
> > Do the status of virtnet_config_changed_work and virtnet_tx_resize need to
> > be counted?
> > 
> 
> My motivation for the counter is detecting full TX queues. I don't think counting them in the control path is useful, but it can be done if you disagree.

Do we then just want "tx full" counter?

> > Thanks,
> > Heng
> > 
> > >
> > >   struct virtnet_rq_stats {
> > > @@ -112,6 +114,8 @@ static const struct virtnet_stat_desc
> > virtnet_sq_stats_desc[] = {
> > >   	{ "xdp_tx_drops",	VIRTNET_SQ_STAT(xdp_tx_drops) },
> > >   	{ "kicks",		VIRTNET_SQ_STAT(kicks) },
> > >   	{ "tx_timeouts",	VIRTNET_SQ_STAT(tx_timeouts) },
> > > +	{ "tx_stop",		VIRTNET_SQ_STAT(tx_stop) },
> > > +	{ "tx_wake",		VIRTNET_SQ_STAT(tx_wake) },
> > >   };
> > >
> > >   static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = { @@
> > > -843,6 +847,9 @@ static void check_sq_full_and_disable(struct virtnet_info
> > *vi,
> > >   	 */
> > >   	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> > >   		netif_stop_subqueue(dev, qnum);
> > > +		u64_stats_update_begin(&sq->stats.syncp);
> > > +		u64_stats_inc(&sq->stats.tx_stop);
> > > +		u64_stats_update_end(&sq->stats.syncp);
> > >   		if (use_napi) {
> > >   			if (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
> > >   				virtqueue_napi_schedule(&sq->napi, sq-
> > >vq); @@ -851,6 +858,9 @@
> > > static void check_sq_full_and_disable(struct virtnet_info *vi,
> > >   			free_old_xmit_skbs(sq, false);
> > >   			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
> > >   				netif_start_subqueue(dev, qnum);
> > > +				u64_stats_update_begin(&sq->stats.syncp);
> > > +				u64_stats_inc(&sq->stats.tx_wake);
> > > +				u64_stats_update_end(&sq->stats.syncp);
> > >   				virtqueue_disable_cb(sq->vq);
> > >   			}
> > >   		}
> > > @@ -2163,8 +2173,14 @@ static void virtnet_poll_cleantx(struct
> > receive_queue *rq)
> > >   			free_old_xmit_skbs(sq, true);
> > >   		} while (unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> > >
> > > -		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > > +		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > > +			if (netif_tx_queue_stopped(txq)) {
> > > +				u64_stats_update_begin(&sq->stats.syncp);
> > > +				u64_stats_inc(&sq->stats.tx_wake);
> > > +				u64_stats_update_end(&sq->stats.syncp);
> > > +			}
> > >   			netif_tx_wake_queue(txq);
> > > +		}
> > >
> > >   		__netif_tx_unlock(txq);
> > >   	}
> > > @@ -2310,8 +2326,14 @@ static int virtnet_poll_tx(struct napi_struct
> > *napi, int budget)
> > >   	virtqueue_disable_cb(sq->vq);
> > >   	free_old_xmit_skbs(sq, true);
> > >
> > > -	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > > +	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > > +		if (netif_tx_queue_stopped(txq)) {
> > > +			u64_stats_update_begin(&sq->stats.syncp);
> > > +			u64_stats_inc(&sq->stats.tx_wake);
> > > +			u64_stats_update_end(&sq->stats.syncp);
> > > +		}
> > >   		netif_tx_wake_queue(txq);
> > > +	}
> > >
> > >   	opaque = virtqueue_enable_cb_prepare(sq->vq);
> > >
>
Michael S. Tsirkin Jan. 30, 2024, 3:53 p.m. UTC | #8
On Tue, Jan 30, 2024 at 03:50:29PM +0000, Daniel Jurgens wrote:
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, January 30, 2024 9:42 AM
> > On Tue, Jan 30, 2024 at 03:40:21PM +0000, Daniel Jurgens wrote:
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Tuesday, January 30, 2024 8:58 AM
> > > >
> > > > On Tue, Jan 30, 2024 at 08:25:21AM -0600, Daniel Jurgens wrote:
> > > > > Add a tx queue stop and wake counters, they are useful for debugging.
> > > > >
> > > > > 	$ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake'
> > > > > 	...
> > > > > 	tx_queue_1_tx_stop: 16726
> > > > > 	tx_queue_1_tx_wake: 16726
> > > > > 	...
> > > > > 	tx_queue_8_tx_stop: 1500110
> > > > > 	tx_queue_8_tx_wake: 1500110
> > > > >
> > > > > Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> > > > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > > >
> > > > Hmm isn't one always same as the other, except when queue is stopped?
> > > > And when it is stopped you can see that in the status?
> > > > So how is having two useful?
> > >
> > > At idle the counters will be the same, unless a tx_timeout occurs. But
> > under load they can be monitored to see which queues are stopped and get
> > an idea of how long they are stopped.
> > 
> > how does it give you the idea of how long they are stopped?
> 
> By serially monitoring the counter you can see stops that persist for long intervals that are shorter than the tx_timeout time.

Why don't you monitor queue status directly?

> > 
> > > Other net drivers (not all), also have the wake counter.
> > 
> > Examples?
> 
> [danielj@sw-mtx-051 upstream]$ ethtool -i ens2f1np1
> driver: mlx5_core                                  
> version: 6.7.0+                                    
> ...
> [danielj@sw-mtx-051 upstream]$ ethtool -S ens2f1np1 | grep wake
>      tx_queue_wake: 0
>      tx0_wake: 0

Do they have a stop counter too?

> > 
> > > In my opinion it makes the stop counter more useful, at little cost.
> > >
> > > >
> > > >
> > > > > ---
> > > > >  drivers/net/virtio_net.c | 26 ++++++++++++++++++++++++--
> > > > >  1 file changed, 24 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > index 3cb8aa193884..7e3c31ceaf7e 100644
> > > > > --- a/drivers/net/virtio_net.c
> > > > > +++ b/drivers/net/virtio_net.c
> > > > > @@ -88,6 +88,8 @@ struct virtnet_sq_stats {
> > > > >  	u64_stats_t xdp_tx_drops;
> > > > >  	u64_stats_t kicks;
> > > > >  	u64_stats_t tx_timeouts;
> > > > > +	u64_stats_t tx_stop;
> > > > > +	u64_stats_t tx_wake;
> > > > >  };
> > > > >
> > > > >  struct virtnet_rq_stats {
> > > > > @@ -112,6 +114,8 @@ static const struct virtnet_stat_desc
> > > > virtnet_sq_stats_desc[] = {
> > > > >  	{ "xdp_tx_drops",	VIRTNET_SQ_STAT(xdp_tx_drops) },
> > > > >  	{ "kicks",		VIRTNET_SQ_STAT(kicks) },
> > > > >  	{ "tx_timeouts",	VIRTNET_SQ_STAT(tx_timeouts) },
> > > > > +	{ "tx_stop",		VIRTNET_SQ_STAT(tx_stop) },
> > > > > +	{ "tx_wake",		VIRTNET_SQ_STAT(tx_wake) },
> > > > >  };
> > > > >
> > > > >  static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = {
> > > > > @@
> > > > > -843,6 +847,9 @@ static void check_sq_full_and_disable(struct
> > > > > virtnet_info
> > > > *vi,
> > > > >  	 */
> > > > >  	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> > > > >  		netif_stop_subqueue(dev, qnum);
> > > > > +		u64_stats_update_begin(&sq->stats.syncp);
> > > > > +		u64_stats_inc(&sq->stats.tx_stop);
> > > > > +		u64_stats_update_end(&sq->stats.syncp);
> > > > >  		if (use_napi) {
> > > > >  			if (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
> > > > >  				virtqueue_napi_schedule(&sq->napi, sq->vq);
> > @@ -851,6 +858,9
> > > > >@@  static void check_sq_full_and_disable(struct virtnet_info *vi,
> > > > >  			free_old_xmit_skbs(sq, false);
> > > > >  			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
> > > > >  				netif_start_subqueue(dev, qnum);
> > > > > +				u64_stats_update_begin(&sq->stats.syncp);
> > > > > +				u64_stats_inc(&sq->stats.tx_wake);
> > > > > +				u64_stats_update_end(&sq->stats.syncp);
> > > > >  				virtqueue_disable_cb(sq->vq);
> > > > >  			}
> > > > >  		}
> > > > > @@ -2163,8 +2173,14 @@ static void virtnet_poll_cleantx(struct
> > > > receive_queue *rq)
> > > > >  			free_old_xmit_skbs(sq, true);
> > > > >  		} while (unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> > > > >
> > > > > -		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > > > > +		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > > > > +			if (netif_tx_queue_stopped(txq)) {
> > > > > +				u64_stats_update_begin(&sq->stats.syncp);
> > > > > +				u64_stats_inc(&sq->stats.tx_wake);
> > > > > +				u64_stats_update_end(&sq->stats.syncp);
> > > > > +			}
> > > > >  			netif_tx_wake_queue(txq);
> > > > > +		}
> > > > >
> > > > >  		__netif_tx_unlock(txq);
> > > > >  	}
> > > > > @@ -2310,8 +2326,14 @@ static int virtnet_poll_tx(struct
> > > > > napi_struct
> > > > *napi, int budget)
> > > > >  	virtqueue_disable_cb(sq->vq);
> > > > >  	free_old_xmit_skbs(sq, true);
> > > > >
> > > > > -	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > > > > +	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > > > > +		if (netif_tx_queue_stopped(txq)) {
> > > > > +			u64_stats_update_begin(&sq->stats.syncp);
> > > > > +			u64_stats_inc(&sq->stats.tx_wake);
> > > > > +			u64_stats_update_end(&sq->stats.syncp);
> > > > > +		}
> > > > >  		netif_tx_wake_queue(txq);
> > > > > +	}
> > > > >
> > > > >  	opaque = virtqueue_enable_cb_prepare(sq->vq);
> > > > >
> > > > > --
> > > > > 2.42.0
Dan Jurgens Jan. 30, 2024, 5:33 p.m. UTC | #9
> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, January 30, 2024 9:53 AM
> On Tue, Jan 30, 2024 at 03:50:29PM +0000, Daniel Jurgens wrote:
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Tuesday, January 30, 2024 9:42 AM On Tue, Jan 30, 2024 at
> > > 03:40:21PM +0000, Daniel Jurgens wrote:
> > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > Sent: Tuesday, January 30, 2024 8:58 AM
> > > > >
> > > > > On Tue, Jan 30, 2024 at 08:25:21AM -0600, Daniel Jurgens wrote:
> > > > > > Add a tx queue stop and wake counters, they are useful for
> debugging.
> > > > > >
> > > > > > 	$ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake'
> > > > > > 	...
> > > > > > 	tx_queue_1_tx_stop: 16726
> > > > > > 	tx_queue_1_tx_wake: 16726
> > > > > > 	...
> > > > > > 	tx_queue_8_tx_stop: 1500110
> > > > > > 	tx_queue_8_tx_wake: 1500110
> > > > > >
> > > > > > Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> > > > > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > > > >
> > > > > Hmm isn't one always same as the other, except when queue is
> stopped?
> > > > > And when it is stopped you can see that in the status?
> > > > > So how is having two useful?
> > > >
> > > > At idle the counters will be the same, unless a tx_timeout occurs.
> > > > But
> > > under load they can be monitored to see which queues are stopped and
> > > get an idea of how long they are stopped.
> > >
> > > how does it give you the idea of how long they are stopped?
> >
> > By serially monitoring the counter you can see stops that persist long
> intervals that are less than the tx_timeout time.
> 
> Why don't you monitor queue status directly?

How? I don't know of any interface to check if a queue is stopped.

> 
> > >
> > > > Other net drivers (not all), also have the wake counter.
> > >
> > > Examples?
> >
> > [danielj@sw-mtx-051 upstream]$ ethtool -i ens2f1np1
> > driver: mlx5_core
> > version: 6.7.0+
> > ...
> > [danielj@sw-mtx-051 upstream]$ ethtool -S ens2f1np1 | grep wake
> >      tx_queue_wake: 0
> >      tx0_wake: 0
> 
> Do they have a stop counter too?

Yes:
[danielj@sw-mtx-051 upstream]$ ethtool -S ens2f1np1 | grep 'stop\|wake'
     tx_queue_stopped: 0
     tx_queue_wake: 0
     tx0_stopped: 0
     tx0_wake: 0
     ....

> 
> > >
> > > > In my opinion it makes the stop counter more useful, at little cost.
> > > >
> > > > >
> > > > >
> > > > > > ---
> > > > > >  drivers/net/virtio_net.c | 26 ++++++++++++++++++++++++--
> > > > > >  1 file changed, 24 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/net/virtio_net.c
> > > > > > b/drivers/net/virtio_net.c index 3cb8aa193884..7e3c31ceaf7e
> > > > > > 100644
> > > > > > --- a/drivers/net/virtio_net.c
> > > > > > +++ b/drivers/net/virtio_net.c
> > > > > > @@ -88,6 +88,8 @@ struct virtnet_sq_stats {
> > > > > >  	u64_stats_t xdp_tx_drops;
> > > > > >  	u64_stats_t kicks;
> > > > > >  	u64_stats_t tx_timeouts;
> > > > > > +	u64_stats_t tx_stop;
> > > > > > +	u64_stats_t tx_wake;
> > > > > >  };
> > > > > >
> > > > > >  struct virtnet_rq_stats {
> > > > > > @@ -112,6 +114,8 @@ static const struct virtnet_stat_desc
> > > > > virtnet_sq_stats_desc[] = {
> > > > > >  	{ "xdp_tx_drops",	VIRTNET_SQ_STAT(xdp_tx_drops) },
> > > > > >  	{ "kicks",		VIRTNET_SQ_STAT(kicks) },
> > > > > >  	{ "tx_timeouts",	VIRTNET_SQ_STAT(tx_timeouts) },
> > > > > > +	{ "tx_stop",		VIRTNET_SQ_STAT(tx_stop) },
> > > > > > +	{ "tx_wake",		VIRTNET_SQ_STAT(tx_wake) },
> > > > > >  };
> > > > > >
> > > > > >  static const struct virtnet_stat_desc virtnet_rq_stats_desc[]
> > > > > > = { @@
> > > > > > -843,6 +847,9 @@ static void check_sq_full_and_disable(struct
> > > > > > virtnet_info
> > > > > *vi,
> > > > > >  	 */
> > > > > >  	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> > > > > >  		netif_stop_subqueue(dev, qnum);
> > > > > > +		u64_stats_update_begin(&sq->stats.syncp);
> > > > > > +		u64_stats_inc(&sq->stats.tx_stop);
> > > > > > +		u64_stats_update_end(&sq->stats.syncp);
> > > > > >  		if (use_napi) {
> > > > > >  			if
> (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
> > > > > >  				virtqueue_napi_schedule(&sq->napi,
> sq->vq);
> > > @@ -851,6 +858,9
> > > > > >@@  static void check_sq_full_and_disable(struct virtnet_info *vi,
> > > > > >  			free_old_xmit_skbs(sq, false);
> > > > > >  			if (sq->vq->num_free >=
> 2+MAX_SKB_FRAGS) {
> > > > > >  				netif_start_subqueue(dev, qnum);
> > > > > > +				u64_stats_update_begin(&sq-
> >stats.syncp);
> > > > > > +				u64_stats_inc(&sq->stats.tx_wake);
> > > > > > +				u64_stats_update_end(&sq-
> >stats.syncp);
> > > > > >  				virtqueue_disable_cb(sq->vq);
> > > > > >  			}
> > > > > >  		}
> > > > > > @@ -2163,8 +2173,14 @@ static void virtnet_poll_cleantx(struct
> > > > > receive_queue *rq)
> > > > > >  			free_old_xmit_skbs(sq, true);
> > > > > >  		} while (unlikely(!virtqueue_enable_cb_delayed(sq-
> >vq)));
> > > > > >
> > > > > > -		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > > > > > +		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > > > > > +			if (netif_tx_queue_stopped(txq)) {
> > > > > > +				u64_stats_update_begin(&sq-
> >stats.syncp);
> > > > > > +				u64_stats_inc(&sq->stats.tx_wake);
> > > > > > +				u64_stats_update_end(&sq-
> >stats.syncp);
> > > > > > +			}
> > > > > >  			netif_tx_wake_queue(txq);
> > > > > > +		}
> > > > > >
> > > > > >  		__netif_tx_unlock(txq);
> > > > > >  	}
> > > > > > @@ -2310,8 +2326,14 @@ static int virtnet_poll_tx(struct
> > > > > > napi_struct
> > > > > *napi, int budget)
> > > > > >  	virtqueue_disable_cb(sq->vq);
> > > > > >  	free_old_xmit_skbs(sq, true);
> > > > > >
> > > > > > -	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > > > > > +	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > > > > > +		if (netif_tx_queue_stopped(txq)) {
> > > > > > +			u64_stats_update_begin(&sq->stats.syncp);
> > > > > > +			u64_stats_inc(&sq->stats.tx_wake);
> > > > > > +			u64_stats_update_end(&sq->stats.syncp);
> > > > > > +		}
> > > > > >  		netif_tx_wake_queue(txq);
> > > > > > +	}
> > > > > >
> > > > > >  	opaque = virtqueue_enable_cb_prepare(sq->vq);
> > > > > >
> > > > > > --
> > > > > > 2.42.0
Jason Xing Jan. 31, 2024, 2:54 a.m. UTC | #10
On Wed, Jan 31, 2024 at 1:53 AM Daniel Jurgens <danielj@nvidia.com> wrote:
>
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, January 30, 2024 9:53 AM
> > On Tue, Jan 30, 2024 at 03:50:29PM +0000, Daniel Jurgens wrote:
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Tuesday, January 30, 2024 9:42 AM On Tue, Jan 30, 2024 at
> > > > 03:40:21PM +0000, Daniel Jurgens wrote:
> > > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > > Sent: Tuesday, January 30, 2024 8:58 AM
> > > > > >
> > > > > > On Tue, Jan 30, 2024 at 08:25:21AM -0600, Daniel Jurgens wrote:
> > > > > > > Add a tx queue stop and wake counters, they are useful for
> > debugging.
> > > > > > >
> > > > > > >     $ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake'
> > > > > > >     ...
> > > > > > >     tx_queue_1_tx_stop: 16726
> > > > > > >     tx_queue_1_tx_wake: 16726
> > > > > > >     ...
> > > > > > >     tx_queue_8_tx_stop: 1500110
> > > > > > >     tx_queue_8_tx_wake: 1500110
> > > > > > >
> > > > > > > Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> > > > > > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > > > > >
> > > > > > Hmm isn't one always same as the other, except when queue is
> > stopped?
> > > > > > And when it is stopped you can see that in the status?
> > > > > > So how is having two useful?
> > > > >
> > > > > At idle the counters will be the same, unless a tx_timeout occurs.
> > > > > But
> > > > under load they can be monitored to see which queues are stopped and
> > > > get an idea of how long they are stopped.
> > > >
> > > > how does it give you the idea of how long they are stopped?
> > >
> > > By serially monitoring the counter you can see stops that persist long
> > intervals that are less than the tx_timeout time.
> >
> > Why don't you monitor queue status directly?
>
> How? I don't know of any interface to check if a queue is stopped.
>
> >
> > > >
> > > > > Other net drivers (not all), also have the wake counter.
> > > >
> > > > Examples?
> > >
> > > [danielj@sw-mtx-051 upstream]$ ethtool -i ens2f1np1
> > > driver: mlx5_core
> > > version: 6.7.0+
> > > ...
> > > [danielj@sw-mtx-051 upstream]$ ethtool -S ens2f1np1 | grep wake
> > >      tx_queue_wake: 0
> > >      tx0_wake: 0
> >
[...]
> > Do they have a stop counter too?
>
> Yes:
> [danielj@sw-mtx-051 upstream]$ ethtool -S ens2f1np1 | grep 'stop\|wake'
>      tx_queue_stopped: 0
>      tx_queue_wake: 0
>      tx0_stopped: 0
>      tx0_wake: 0
>      ....

Yes, that's it! What I know is that only mlx drivers have those two
counters, but they are very useful when debugging some issues or
tracking some historical changes if we want to.

Thanks,
Jason

>
> >
> > > >
> > > > > In my opinion it makes the stop counter more useful, at little cost.
> > > > >
> > > > > >
> > > > > >
> > > > > > > ---
> > > > > > >  drivers/net/virtio_net.c | 26 ++++++++++++++++++++++++--
> > > > > > >  1 file changed, 24 insertions(+), 2 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/net/virtio_net.c
> > > > > > > b/drivers/net/virtio_net.c index 3cb8aa193884..7e3c31ceaf7e
> > > > > > > 100644
> > > > > > > --- a/drivers/net/virtio_net.c
> > > > > > > +++ b/drivers/net/virtio_net.c
> > > > > > > @@ -88,6 +88,8 @@ struct virtnet_sq_stats {
> > > > > > >     u64_stats_t xdp_tx_drops;
> > > > > > >     u64_stats_t kicks;
> > > > > > >     u64_stats_t tx_timeouts;
> > > > > > > +   u64_stats_t tx_stop;
> > > > > > > +   u64_stats_t tx_wake;
> > > > > > >  };
> > > > > > >
> > > > > > >  struct virtnet_rq_stats {
> > > > > > > @@ -112,6 +114,8 @@ static const struct virtnet_stat_desc
> > > > > > virtnet_sq_stats_desc[] = {
> > > > > > >     { "xdp_tx_drops",       VIRTNET_SQ_STAT(xdp_tx_drops) },
> > > > > > >     { "kicks",              VIRTNET_SQ_STAT(kicks) },
> > > > > > >     { "tx_timeouts",        VIRTNET_SQ_STAT(tx_timeouts) },
> > > > > > > +   { "tx_stop",            VIRTNET_SQ_STAT(tx_stop) },
> > > > > > > +   { "tx_wake",            VIRTNET_SQ_STAT(tx_wake) },
> > > > > > >  };
> > > > > > >
> > > > > > >  static const struct virtnet_stat_desc virtnet_rq_stats_desc[]
> > > > > > > = { @@
> > > > > > > -843,6 +847,9 @@ static void check_sq_full_and_disable(struct
> > > > > > > virtnet_info
> > > > > > *vi,
> > > > > > >      */
> > > > > > >     if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> > > > > > >             netif_stop_subqueue(dev, qnum);
> > > > > > > +           u64_stats_update_begin(&sq->stats.syncp);
> > > > > > > +           u64_stats_inc(&sq->stats.tx_stop);
> > > > > > > +           u64_stats_update_end(&sq->stats.syncp);
> > > > > > >             if (use_napi) {
> > > > > > >                     if
> > (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
> > > > > > >                             virtqueue_napi_schedule(&sq->napi,
> > sq->vq);
> > > > @@ -851,6 +858,9
> > > > > > >@@  static void check_sq_full_and_disable(struct virtnet_info *vi,
> > > > > > >                     free_old_xmit_skbs(sq, false);
> > > > > > >                     if (sq->vq->num_free >=
> > 2+MAX_SKB_FRAGS) {
> > > > > > >                             netif_start_subqueue(dev, qnum);
> > > > > > > +                           u64_stats_update_begin(&sq-
> > >stats.syncp);
> > > > > > > +                           u64_stats_inc(&sq->stats.tx_wake);
> > > > > > > +                           u64_stats_update_end(&sq-
> > >stats.syncp);
> > > > > > >                             virtqueue_disable_cb(sq->vq);
> > > > > > >                     }
> > > > > > >             }
> > > > > > > @@ -2163,8 +2173,14 @@ static void virtnet_poll_cleantx(struct
> > > > > > receive_queue *rq)
> > > > > > >                     free_old_xmit_skbs(sq, true);
> > > > > > >             } while (unlikely(!virtqueue_enable_cb_delayed(sq-
> > >vq)));
> > > > > > >
> > > > > > > -           if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > > > > > > +           if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > > > > > > +                   if (netif_tx_queue_stopped(txq)) {
> > > > > > > +                           u64_stats_update_begin(&sq-
> > >stats.syncp);
> > > > > > > +                           u64_stats_inc(&sq->stats.tx_wake);
> > > > > > > +                           u64_stats_update_end(&sq-
> > >stats.syncp);
> > > > > > > +                   }
> > > > > > >                     netif_tx_wake_queue(txq);
> > > > > > > +           }
> > > > > > >
> > > > > > >             __netif_tx_unlock(txq);
> > > > > > >     }
> > > > > > > @@ -2310,8 +2326,14 @@ static int virtnet_poll_tx(struct
> > > > > > > napi_struct
> > > > > > *napi, int budget)
> > > > > > >     virtqueue_disable_cb(sq->vq);
> > > > > > >     free_old_xmit_skbs(sq, true);
> > > > > > >
> > > > > > > -   if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> > > > > > > +   if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
> > > > > > > +           if (netif_tx_queue_stopped(txq)) {
> > > > > > > +                   u64_stats_update_begin(&sq->stats.syncp);
> > > > > > > +                   u64_stats_inc(&sq->stats.tx_wake);
> > > > > > > +                   u64_stats_update_end(&sq->stats.syncp);
> > > > > > > +           }
> > > > > > >             netif_tx_wake_queue(txq);
> > > > > > > +   }
> > > > > > >
> > > > > > >     opaque = virtqueue_enable_cb_prepare(sq->vq);
> > > > > > >
> > > > > > > --
> > > > > > > 2.42.0
>
>
Jakub Kicinski Feb. 2, 2024, 4:21 a.m. UTC | #11
On Wed, 31 Jan 2024 10:54:33 +0800 Jason Xing wrote:
> > [danielj@sw-mtx-051 upstream]$ ethtool -S ens2f1np1 | grep 'stop\|wake'
> >      tx_queue_stopped: 0
> >      tx_queue_wake: 0
> >      tx0_stopped: 0
> >      tx0_wake: 0
> >      ....  
> 
> Yes, that's it! What I know is that only mlx drivers have those two
> counters, but they are very useful when debugging some issues or
> tracking some historical changes if we want to.

Can you say more? I'm curious what's your use case.
Jason Xing Feb. 2, 2024, 6:52 a.m. UTC | #12
On Fri, Feb 2, 2024 at 12:21 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 31 Jan 2024 10:54:33 +0800 Jason Xing wrote:
> > > [danielj@sw-mtx-051 upstream]$ ethtool -S ens2f1np1 | grep 'stop\|wake'
> > >      tx_queue_stopped: 0
> > >      tx_queue_wake: 0
> > >      tx0_stopped: 0
> > >      tx0_wake: 0
> > >      ....
> >
> > Yes, that's it! What I know is that only mlx drivers have those two
> > counters, but they are very useful when debugging some issues or
> > tracking some historical changes if we want to.
>
> Can you say more? I'm curious what's your use case.

I'm not working at Nvidia, so my point of view may differ from theirs.
From what I can tell, those two counters help me narrow down
the range if I have to diagnose/debug some issues.
1) I sometimes notice that if some irq is held too long (say, one
simple case: output of printk printed to the console), those two
counters can reflect the issue.
2) Similarly in virtio net, recently I traced such counters the
current kernel does not have and it turned out that one of the output
queues in the backend behaves badly.
...

Stop/wake queue counters may not show directly the root cause of the
issue, but help us 'guess' to some extent.

Thanks,
Jason
Jakub Kicinski Feb. 2, 2024, 4:01 p.m. UTC | #13
On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote:
> > Can you say more? I'm curious what's your use case.  
> 
> I'm not working at Nvidia, so my point of view may differ from theirs.
> From what I can tell is that those two counters help me narrow down
> the range if I have to diagnose/debug some issues.

right, i'm asking to collect useful debugging tricks, nothing against
the patch itself :)

> 1) I sometimes notice that if some irq is held too long (say, one
> simple case: output of printk printed to the console), those two
> counters can reflect the issue.
> 2) Similarly in virtio net, recently I traced such counters the
> current kernel does not have and it turned out that one of the output
> queues in the backend behaves badly.
> ...
> 
> Stop/wake queue counters may not show directly the root cause of the
> issue, but help us 'guess' to some extent.

I'm surprised you say you can detect stall-related issues with this.
I guess virtio doesn't have BQL support, which makes it special.
Normal HW drivers with BQL almost never stop the queue by themselves.
I mean - if they do, and BQL is active, then the system is probably
misconfigured (queue is too short). This is what we use at Meta to
detect stalls in drivers with BQL:

https://lore.kernel.org/all/20240131102150.728960-3-leitao@debian.org/

Daniel, I think this may be a good enough excuse to add per-queue stats
to the netdev genl family, if you're up for that. LMK if you want more
info, otherwise I guess ethtool -S is fine for now.
Dan Jurgens Feb. 2, 2024, 4:46 p.m. UTC | #14
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Friday, February 2, 2024 10:01 AM
> Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake counters
> 
> On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote:
> > > Can you say more? I'm curious what's your use case.
> >
> > I'm not working at Nvidia, so my point of view may differ from theirs.
> > From what I can tell is that those two counters help me narrow down
> > the range if I have to diagnose/debug some issues.
> 
> right, i'm asking to collect useful debugging tricks, nothing against the patch
> itself :)
> 
> > 1) I sometimes notice that if some irq is held too long (say, one
> > simple case: output of printk printed to the console), those two
> > counters can reflect the issue.
> > 2) Similarly in virtio net, recently I traced such counters the
> > current kernel does not have and it turned out that one of the output
> > queues in the backend behaves badly.
> > ...
> >
> > Stop/wake queue counters may not show directly the root cause of the
> > issue, but help us 'guess' to some extent.
> 
> I'm surprised you say you can detect stall-related issues with this.
> I guess virtio doesn't have BQL support, which makes it special.
> Normal HW drivers with BQL almost never stop the queue by themselves.
> I mean - if they do, and BQL is active, then the system is probably
> misconfigured (queue is too short). This is what we use at Meta to detect
> stalls in drivers with BQL:
> 
> https://lore.kernel.org/all/20240131102150.728960-3-leitao@debian.org/
> 
> Daniel, I think this may be a good enough excuse to add per-queue stats to
> the netdev genl family, if you're up for that. LMK if you want more info,
> otherwise I guess ethtool -S is fine for now.

For now, I would like to go with ethtool -S. But I can take on the netdev level as a background task. Are you suggesting adding them to rtnl_link_stats64?
Jakub Kicinski Feb. 2, 2024, 5:38 p.m. UTC | #15
On Fri, 2 Feb 2024 16:46:28 +0000 Daniel Jurgens wrote:
> > Daniel, I think this may be a good enough excuse to add per-queue stats to
> > the netdev genl family, if you're up for that. LMK if you want more info,
> > otherwise I guess ethtool -S is fine for now.  
> 
> For now, I would like to go with ethtool -S. But I can take on the
> netdev level as a background task. Are you suggesting adding them to
> rtnl_link_stats64?

More like page pool stats. See d49010adae7376384
Jason Wang Feb. 4, 2024, 1:20 a.m. UTC | #16
On Sat, Feb 3, 2024 at 12:01 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote:
> > > Can you say more? I'm curious what's your use case.
> >
> > I'm not working at Nvidia, so my point of view may differ from theirs.
> > From what I can tell is that those two counters help me narrow down
> > the range if I have to diagnose/debug some issues.
>
> right, i'm asking to collect useful debugging tricks, nothing against
> the patch itself :)
>
> > 1) I sometimes notice that if some irq is held too long (say, one
> > simple case: output of printk printed to the console), those two
> > counters can reflect the issue.
> > 2) Similarly in virtio net, recently I traced such counters the
> > current kernel does not have and it turned out that one of the output
> > queues in the backend behaves badly.
> > ...
> >
> > Stop/wake queue counters may not show directly the root cause of the
> > issue, but help us 'guess' to some extent.
>
> I'm surprised you say you can detect stall-related issues with this.
> I guess virtio doesn't have BQL support, which makes it special.

Yes, virtio-net has a legacy orphan mode, this is something that needs
to be dropped in the future. This would make BQL much more easier to
be implemented.

> Normal HW drivers with BQL almost never stop the queue by themselves.
> I mean - if they do, and BQL is active, then the system is probably
> misconfigured (queue is too short). This is what we use at Meta to
> detect stalls in drivers with BQL:
>
> https://lore.kernel.org/all/20240131102150.728960-3-leitao@debian.org/
>
> Daniel, I think this may be a good enough excuse to add per-queue stats
> to the netdev genl family, if you're up for that. LMK if you want more
> info, otherwise I guess ethtool -S is fine for now.
>

Thanks
Michael S. Tsirkin Feb. 4, 2024, 12:39 p.m. UTC | #17
On Sun, Feb 04, 2024 at 09:20:18AM +0800, Jason Wang wrote:
> On Sat, Feb 3, 2024 at 12:01 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote:
> > > > Can you say more? I'm curious what's your use case.
> > >
> > > I'm not working at Nvidia, so my point of view may differ from theirs.
> > > From what I can tell is that those two counters help me narrow down
> > > the range if I have to diagnose/debug some issues.
> >
> > right, i'm asking to collect useful debugging tricks, nothing against
> > the patch itself :)
> >
> > > 1) I sometimes notice that if some irq is held too long (say, one
> > > simple case: output of printk printed to the console), those two
> > > counters can reflect the issue.
> > > 2) Similarly in virtio net, recently I traced such counters the
> > > current kernel does not have and it turned out that one of the output
> > > queues in the backend behaves badly.
> > > ...
> > >
> > > Stop/wake queue counters may not show directly the root cause of the
> > > issue, but help us 'guess' to some extent.
> >
> > I'm surprised you say you can detect stall-related issues with this.
> > I guess virtio doesn't have BQL support, which makes it special.
> 
> Yes, virtio-net has a legacy orphan mode, this is something that needs
> to be dropped in the future. This would make BQL much more easier to
> be implemented.


It's not that we can't implement BQL, it's that it does not seem to
be beneficial - has been discussed many times.

> > Normal HW drivers with BQL almost never stop the queue by themselves.
> > I mean - if they do, and BQL is active, then the system is probably
> > misconfigured (queue is too short). This is what we use at Meta to
> > detect stalls in drivers with BQL:
> >
> > https://lore.kernel.org/all/20240131102150.728960-3-leitao@debian.org/
> >
> > Daniel, I think this may be a good enough excuse to add per-queue stats
> > to the netdev genl family, if you're up for that. LMK if you want more
> > info, otherwise I guess ethtool -S is fine for now.
> >
> 
> Thanks
Jason Wang Feb. 5, 2024, 1:45 a.m. UTC | #18
On Sun, Feb 4, 2024 at 8:39 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Sun, Feb 04, 2024 at 09:20:18AM +0800, Jason Wang wrote:
> > On Sat, Feb 3, 2024 at 12:01 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote:
> > > > > Can you say more? I'm curious what's your use case.
> > > >
> > > > I'm not working at Nvidia, so my point of view may differ from theirs.
> > > > From what I can tell is that those two counters help me narrow down
> > > > the range if I have to diagnose/debug some issues.
> > >
> > > right, i'm asking to collect useful debugging tricks, nothing against
> > > the patch itself :)
> > >
> > > > 1) I sometimes notice that if some irq is held too long (say, one
> > > > simple case: output of printk printed to the console), those two
> > > > counters can reflect the issue.
> > > > 2) Similarly in virtio net, recently I traced such counters the
> > > > current kernel does not have and it turned out that one of the output
> > > > queues in the backend behaves badly.
> > > > ...
> > > >
> > > > Stop/wake queue counters may not show directly the root cause of the
> > > > issue, but help us 'guess' to some extent.
> > >
> > > I'm surprised you say you can detect stall-related issues with this.
> > > I guess virtio doesn't have BQL support, which makes it special.
> >
> > Yes, virtio-net has a legacy orphan mode, this is something that needs
> > to be dropped in the future. This would make BQL much more easier to
> > be implemented.
>
>
> It's not that we can't implement BQL,

Well, I don't say we can't, I say it's not easy, as we need to deal
with the switching between two modes[1]. If we just have one mode like
TX interrupt, we don't need to care about that.

> it's that it does not seem to
> be benefitial - has been discussed many times.

Virtio doesn't differ from other NICs too much, for example gve supports bql.

1) There's no numbers in [1]
2) We only benchmark vhost-net but not others, for example, vhost-user
and hardware implementations
3) We don't have interrupt coalescing in 2018 but now we have with DIM

Thanks

[1] https://lore.kernel.org/netdev/20181205225323.12555-1-mst@redhat.com/


>
> > > Normal HW drivers with BQL almost never stop the queue by themselves.
> > > I mean - if they do, and BQL is active, then the system is probably
> > > misconfigured (queue is too short). This is what we use at Meta to
> > > detect stalls in drivers with BQL:
> > >
> > > https://lore.kernel.org/all/20240131102150.728960-3-leitao@debian.org/
> > >
> > > Daniel, I think this may be a good enough excuse to add per-queue stats
> > > to the netdev genl family, if you're up for that. LMK if you want more
> > > info, otherwise I guess ethtool -S is fine for now.
> > >
> >
> > Thanks
>
Dan Jurgens Feb. 7, 2024, 7:38 p.m. UTC | #19
> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Sunday, February 4, 2024 6:40 AM
> To: Jason Wang <jasowang@redhat.com>
> Cc: Jakub Kicinski <kuba@kernel.org>; Jason Xing
> <kerneljasonxing@gmail.com>; Daniel Jurgens <danielj@nvidia.com>;
> netdev@vger.kernel.org; xuanzhuo@linux.alibaba.com;
> virtualization@lists.linux.dev; davem@davemloft.net;
> edumazet@google.com; abeni@redhat.com; Parav Pandit
> <parav@nvidia.com>
> Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake counters
> 
> On Sun, Feb 04, 2024 at 09:20:18AM +0800, Jason Wang wrote:
> > On Sat, Feb 3, 2024 at 12:01 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote:
> > > > > Can you say more? I'm curious what's your use case.
> > > >
> > > > I'm not working at Nvidia, so my point of view may differ from theirs.
> > > > From what I can tell is that those two counters help me narrow
> > > > down the range if I have to diagnose/debug some issues.
> > >
> > > right, i'm asking to collect useful debugging tricks, nothing
> > > against the patch itself :)
> > >
> > > > 1) I sometimes notice that if some irq is held too long (say, one
> > > > simple case: output of printk printed to the console), those two
> > > > counters can reflect the issue.
> > > > 2) Similarly in virtio net, recently I traced such counters the
> > > > current kernel does not have and it turned out that one of the
> > > > output queues in the backend behaves badly.
> > > > ...
> > > >
> > > > Stop/wake queue counters may not show directly the root cause of
> > > > the issue, but help us 'guess' to some extent.
> > >
> > > I'm surprised you say you can detect stall-related issues with this.
> > > I guess virtio doesn't have BQL support, which makes it special.
> >
> > Yes, virtio-net has a legacy orphan mode, this is something that needs
> > to be dropped in the future. This would make BQL much more easier to
> > be implemented.
> 
> 
> It's not that we can't implement BQL, it's that it does not seem to be
> benefitial - has been discussed many times.
> 
> > > Normal HW drivers with BQL almost never stop the queue by themselves.
> > > I mean - if they do, and BQL is active, then the system is probably
> > > misconfigured (queue is too short). This is what we use at Meta to
> > > detect stalls in drivers with BQL:
> > >
> > > https://lore.kernel.org/all/20240131102150.728960-3-leitao@debian.or
> > > g/
> > >
> > > Daniel, I think this may be a good enough excuse to add per-queue
> > > stats to the netdev genl family, if you're up for that. LMK if you
> > > want more info, otherwise I guess ethtool -S is fine for now.
> > >
> >
> > Thanks

Michael,
	Are you OK with this patch? Unless I missed it, I didn't see a response from you in our conversation the day I sent it.
Michael S. Tsirkin Feb. 7, 2024, 8:19 p.m. UTC | #20
On Wed, Feb 07, 2024 at 07:38:16PM +0000, Daniel Jurgens wrote:
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Sunday, February 4, 2024 6:40 AM
> > To: Jason Wang <jasowang@redhat.com>
> > Cc: Jakub Kicinski <kuba@kernel.org>; Jason Xing
> > <kerneljasonxing@gmail.com>; Daniel Jurgens <danielj@nvidia.com>;
> > netdev@vger.kernel.org; xuanzhuo@linux.alibaba.com;
> > virtualization@lists.linux.dev; davem@davemloft.net;
> > edumazet@google.com; abeni@redhat.com; Parav Pandit
> > <parav@nvidia.com>
> > Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake counters
> > 
> > On Sun, Feb 04, 2024 at 09:20:18AM +0800, Jason Wang wrote:
> > > On Sat, Feb 3, 2024 at 12:01 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > > >
> > > > On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote:
> > > > > > Can you say more? I'm curious what's your use case.
> > > > >
> > > > > I'm not working at Nvidia, so my point of view may differ from theirs.
> > > > > From what I can tell is that those two counters help me narrow
> > > > > down the range if I have to diagnose/debug some issues.
> > > >
> > > > right, i'm asking to collect useful debugging tricks, nothing
> > > > against the patch itself :)
> > > >
> > > > > 1) I sometimes notice that if some irq is held too long (say, one
> > > > > simple case: output of printk printed to the console), those two
> > > > > counters can reflect the issue.
> > > > > 2) Similarly in virtio net, recently I traced such counters the
> > > > > current kernel does not have and it turned out that one of the
> > > > > output queues in the backend behaves badly.
> > > > > ...
> > > > >
> > > > > Stop/wake queue counters may not show directly the root cause of
> > > > > the issue, but help us 'guess' to some extent.
> > > >
> > > > I'm surprised you say you can detect stall-related issues with this.
> > > > I guess virtio doesn't have BQL support, which makes it special.
> > >
> > > Yes, virtio-net has a legacy orphan mode, this is something that needs
> > > to be dropped in the future. This would make BQL much more easier to
> > > be implemented.
> > 
> > 
> > It's not that we can't implement BQL, it's that it does not seem to be
> > benefitial - has been discussed many times.
> > 
> > > > Normal HW drivers with BQL almost never stop the queue by themselves.
> > > > I mean - if they do, and BQL is active, then the system is probably
> > > > misconfigured (queue is too short). This is what we use at Meta to
> > > > detect stalls in drivers with BQL:
> > > >
> > > > https://lore.kernel.org/all/20240131102150.728960-3-leitao@debian.or
> > > > g/
> > > >
> > > > Daniel, I think this may be a good enough excuse to add per-queue
> > > > stats to the netdev genl family, if you're up for that. LMK if you
> > > > want more info, otherwise I guess ethtool -S is fine for now.
> > > >
> > >
> > > Thanks
> 
> Michael,
> 	Are you OK with this patch? Unless I missed it I didn't see a response from you in our conversation the day I sent it.
> 

I thought what is proposed is adding some support for these stats to core?
Did I misunderstand?
Michael S. Tsirkin Feb. 7, 2024, 8:21 p.m. UTC | #21
On Mon, Feb 05, 2024 at 09:45:38AM +0800, Jason Wang wrote:
> On Sun, Feb 4, 2024 at 8:39 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Sun, Feb 04, 2024 at 09:20:18AM +0800, Jason Wang wrote:
> > > On Sat, Feb 3, 2024 at 12:01 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > > >
> > > > On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote:
> > > > > > Can you say more? I'm curious what's your use case.
> > > > >
> > > > > I'm not working at Nvidia, so my point of view may differ from theirs.
> > > > > From what I can tell is that those two counters help me narrow down
> > > > > the range if I have to diagnose/debug some issues.
> > > >
> > > > right, i'm asking to collect useful debugging tricks, nothing against
> > > > the patch itself :)
> > > >
> > > > > 1) I sometimes notice that if some irq is held too long (say, one
> > > > > simple case: output of printk printed to the console), those two
> > > > > counters can reflect the issue.
> > > > > 2) Similarly in virtio net, recently I traced such counters the
> > > > > current kernel does not have and it turned out that one of the output
> > > > > queues in the backend behaves badly.
> > > > > ...
> > > > >
> > > > > Stop/wake queue counters may not show directly the root cause of the
> > > > > issue, but help us 'guess' to some extent.
> > > >
> > > > I'm surprised you say you can detect stall-related issues with this.
> > > > I guess virtio doesn't have BQL support, which makes it special.
> > >
> > > Yes, virtio-net has a legacy orphan mode, this is something that needs
> > > to be dropped in the future. This would make BQL much more easier to
> > > be implemented.
> >
> >
> > It's not that we can't implement BQL,
> 
> Well, I don't say we can't, I say it's not easy as we need to deal
> with the switching between two modes[1]. If we just have one mode like
> TX interrupt, we don't need to care about that.
> 
> > it's that it does not seem to
> > be benefitial - has been discussed many times.
> 
> Virtio doesn't differ from other NIC too much, for example gve supports bql.
> 
> 1) There's no numbers in [1]
> 2) We only benchmark vhost-net but not others, for example, vhost-user
> and hardware implementations
> 3) We don't have interrupt coalescing in 2018 but now we have with DIM

Only works well with hardware virtio cards though.

> Thanks
> 
> [1] https://lore.kernel.org/netdev/20181205225323.12555-1-mst@redhat.com/
> 

So fundamentally, someone needs to show benefit and no serious
regressions to add BQL at this point. I doubt it's easily practical.
Hacks like wireless does to boost buffer sizes might be necessary.

> >
> > > > Normal HW drivers with BQL almost never stop the queue by themselves.
> > > > I mean - if they do, and BQL is active, then the system is probably
> > > > misconfigured (queue is too short). This is what we use at Meta to
> > > > detect stalls in drivers with BQL:
> > > >
> > > > https://lore.kernel.org/all/20240131102150.728960-3-leitao@debian.org/
> > > >
> > > > Daniel, I think this may be a good enough excuse to add per-queue stats
> > > > to the netdev genl family, if you're up for that. LMK if you want more
> > > > info, otherwise I guess ethtool -S is fine for now.
> > > >
> > >
> > > Thanks
> >
Dan Jurgens Feb. 7, 2024, 8:59 p.m. UTC | #22
> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, February 7, 2024 2:19 PM
> To: Daniel Jurgens <danielj@nvidia.com>
> Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake counters
> 
> On Wed, Feb 07, 2024 at 07:38:16PM +0000, Daniel Jurgens wrote:
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Sunday, February 4, 2024 6:40 AM
> > > To: Jason Wang <jasowang@redhat.com>
> > > Cc: Jakub Kicinski <kuba@kernel.org>; Jason Xing
> > > <kerneljasonxing@gmail.com>; Daniel Jurgens <danielj@nvidia.com>;
> > > netdev@vger.kernel.org; xuanzhuo@linux.alibaba.com;
> > > virtualization@lists.linux.dev; davem@davemloft.net;
> > > edumazet@google.com; abeni@redhat.com; Parav Pandit
> > > <parav@nvidia.com>
> > > Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake
> > > counters
> > >
> > > On Sun, Feb 04, 2024 at 09:20:18AM +0800, Jason Wang wrote:
> > > > On Sat, Feb 3, 2024 at 12:01 AM Jakub Kicinski <kuba@kernel.org>
> wrote:
> > > > >
> > > > > On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote:
> > > > > > > Can you say more? I'm curious what's your use case.
> > > > > >
> > > > > > I'm not working at Nvidia, so my point of view may differ from
> theirs.
> > > > > > From what I can tell is that those two counters help me narrow
> > > > > > down the range if I have to diagnose/debug some issues.
> > > > >
> > > > > right, i'm asking to collect useful debugging tricks, nothing
> > > > > against the patch itself :)
> > > > >
> > > > > > 1) I sometimes notice that if some irq is held too long (say,
> > > > > > one simple case: output of printk printed to the console),
> > > > > > those two counters can reflect the issue.
> > > > > > 2) Similarly in virtio net, recently I traced such counters
> > > > > > the current kernel does not have and it turned out that one of
> > > > > > the output queues in the backend behaves badly.
> > > > > > ...
> > > > > >
> > > > > > Stop/wake queue counters may not show directly the root cause
> > > > > > of the issue, but help us 'guess' to some extent.
> > > > >
> > > > > I'm surprised you say you can detect stall-related issues with this.
> > > > > I guess virtio doesn't have BQL support, which makes it special.
> > > >
> > > > Yes, virtio-net has a legacy orphan mode, this is something that
> > > > needs to be dropped in the future. This would make BQL much more
> > > > easier to be implemented.
> > >
> > >
> > > It's not that we can't implement BQL, it's that it does not seem to
> > > be benefitial - has been discussed many times.
> > >
> > > > > Normal HW drivers with BQL almost never stop the queue by
> themselves.
> > > > > I mean - if they do, and BQL is active, then the system is
> > > > > probably misconfigured (queue is too short). This is what we use
> > > > > at Meta to detect stalls in drivers with BQL:
> > > > >
> > > > > https://lore.kernel.org/all/20240131102150.728960-3-leitao@debia
> > > > > n.or
> > > > > g/
> > > > >
> > > > > Daniel, I think this may be a good enough excuse to add
> > > > > per-queue stats to the netdev genl family, if you're up for
> > > > > that. LMK if you want more info, otherwise I guess ethtool -S is fine
> for now.
> > > > >
> > > >
> > > > Thanks
> >
> > Michael,
> > 	Are you OK with this patch? Unless I missed it I didn't see a response
> from you in our conversation the day I sent it.
> >
> 
> I thought what is proposed is adding some support for these stats to core?
> Did I misunderstood?
> 

That's a much bigger change, and going that route I think we still need to count them at the driver level. I said I could potentially take that on as a background project. But would prefer to go with ethtool -S for now.

> --
> MST
Dan Jurgens Feb. 20, 2024, 6:02 p.m. UTC | #23
> From: Daniel Jurgens <danielj@nvidia.com>
> Sent: Wednesday, February 7, 2024 2:59 PM
> To: Michael S. Tsirkin <mst@redhat.com>
> Cc: Jason Wang <jasowang@redhat.com>; Jakub Kicinski
> <kuba@kernel.org>; Jason Xing <kerneljasonxing@gmail.com>;
> netdev@vger.kernel.org; xuanzhuo@linux.alibaba.com;
> virtualization@lists.linux.dev; davem@davemloft.net;
> edumazet@google.com; abeni@redhat.com; Parav Pandit
> <parav@nvidia.com>
> Subject: RE: [PATCH net-next] virtio_net: Add TX stop and wake counters
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, February 7, 2024 2:19 PM
> > To: Daniel Jurgens <danielj@nvidia.com>
> > Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake
> > counters
> >
> > On Wed, Feb 07, 2024 at 07:38:16PM +0000, Daniel Jurgens wrote:
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Sunday, February 4, 2024 6:40 AM
> > > > To: Jason Wang <jasowang@redhat.com>
> > > > Cc: Jakub Kicinski <kuba@kernel.org>; Jason Xing
> > > > <kerneljasonxing@gmail.com>; Daniel Jurgens <danielj@nvidia.com>;
> > > > netdev@vger.kernel.org; xuanzhuo@linux.alibaba.com;
> > > > virtualization@lists.linux.dev; davem@davemloft.net;
> > > > edumazet@google.com; abeni@redhat.com; Parav Pandit
> > > > <parav@nvidia.com>
> > > > Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake
> > > > counters
> > > >
> > > > On Sun, Feb 04, 2024 at 09:20:18AM +0800, Jason Wang wrote:
> > > > > On Sat, Feb 3, 2024 at 12:01 AM Jakub Kicinski <kuba@kernel.org>
> > wrote:
> > > > > >
> > > > > > On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote:
> > > > > > > > Can you say more? I'm curious what's your use case.
> > > > > > >
> > > > > > > I'm not working at Nvidia, so my point of view may differ
> > > > > > > from
> > theirs.
> > > > > > > From what I can tell is that those two counters help me
> > > > > > > narrow down the range if I have to diagnose/debug some issues.
> > > > > >
> > > > > > right, i'm asking to collect useful debugging tricks, nothing
> > > > > > against the patch itself :)
> > > > > >
> > > > > > > 1) I sometimes notice that if some irq is held too long
> > > > > > > (say, one simple case: output of printk printed to the
> > > > > > > console), those two counters can reflect the issue.
> > > > > > > 2) Similarly in virtio net, recently I traced such counters
> > > > > > > the current kernel does not have and it turned out that one
> > > > > > > of the output queues in the backend behaves badly.
> > > > > > > ...
> > > > > > >
> > > > > > > Stop/wake queue counters may not show directly the root
> > > > > > > cause of the issue, but help us 'guess' to some extent.
> > > > > >
> > > > > > I'm surprised you say you can detect stall-related issues with this.
> > > > > > I guess virtio doesn't have BQL support, which makes it special.
> > > > >
> > > > > Yes, virtio-net has a legacy orphan mode, this is something that
> > > > > needs to be dropped in the future. This would make BQL much more
> > > > > easier to be implemented.
> > > >
> > > >
> > > > It's not that we can't implement BQL, it's that it does not seem
> > > > to be benefitial - has been discussed many times.
> > > >
> > > > > > Normal HW drivers with BQL almost never stop the queue by
> > themselves.
> > > > > > I mean - if they do, and BQL is active, then the system is
> > > > > > probably misconfigured (queue is too short). This is what we
> > > > > > use at Meta to detect stalls in drivers with BQL:
> > > > > >
> > > > > > https://lore.kernel.org/all/20240131102150.728960-3-leitao@deb
> > > > > > ia
> > > > > > n.or
> > > > > > g/
> > > > > >
> > > > > > Daniel, I think this may be a good enough excuse to add
> > > > > > per-queue stats to the netdev genl family, if you're up for
> > > > > > that. LMK if you want more info, otherwise I guess ethtool -S
> > > > > > is fine
> > for now.
> > > > > >
> > > > >
> > > > > Thanks
> > >
> > > Michael,
> > > 	Are you OK with this patch? Unless I missed it I didn't see a
> > > response
> > from you in our conversation the day I sent it.
> > >
> >
> > I thought what is proposed is adding some support for these stats to core?
> > Did I misunderstood?
> >
> 
> That's a much bigger change and going that route I think still need to count
> them at the driver level. I said I could potentially take that on as a background
> project. But would prefer to go with ethtool -S for now.

Michael, are you a NACK on this? Jakub seemed OK with it, Jason also thinks it's useful, and it's low risk. 

> 
> > --
> > MST
Michael S. Tsirkin Feb. 20, 2024, 6:05 p.m. UTC | #24
On Tue, Feb 20, 2024 at 06:02:46PM +0000, Dan Jurgens wrote:
> > From: Daniel Jurgens <danielj@nvidia.com>
> > Sent: Wednesday, February 7, 2024 2:59 PM
> > To: Michael S. Tsirkin <mst@redhat.com>
> > Cc: Jason Wang <jasowang@redhat.com>; Jakub Kicinski
> > <kuba@kernel.org>; Jason Xing <kerneljasonxing@gmail.com>;
> > netdev@vger.kernel.org; xuanzhuo@linux.alibaba.com;
> > virtualization@lists.linux.dev; davem@davemloft.net;
> > edumazet@google.com; abeni@redhat.com; Parav Pandit
> > <parav@nvidia.com>
> > Subject: RE: [PATCH net-next] virtio_net: Add TX stop and wake counters
> > 
> > 
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Wednesday, February 7, 2024 2:19 PM
> > > To: Daniel Jurgens <danielj@nvidia.com>
> > > Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake
> > > counters
> > >
> > > On Wed, Feb 07, 2024 at 07:38:16PM +0000, Daniel Jurgens wrote:
> > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > Sent: Sunday, February 4, 2024 6:40 AM
> > > > > To: Jason Wang <jasowang@redhat.com>
> > > > > Cc: Jakub Kicinski <kuba@kernel.org>; Jason Xing
> > > > > <kerneljasonxing@gmail.com>; Daniel Jurgens <danielj@nvidia.com>;
> > > > > netdev@vger.kernel.org; xuanzhuo@linux.alibaba.com;
> > > > > virtualization@lists.linux.dev; davem@davemloft.net;
> > > > > edumazet@google.com; abeni@redhat.com; Parav Pandit
> > > > > <parav@nvidia.com>
> > > > > Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake
> > > > > counters
> > > > >
> > > > > On Sun, Feb 04, 2024 at 09:20:18AM +0800, Jason Wang wrote:
> > > > > > On Sat, Feb 3, 2024 at 12:01 AM Jakub Kicinski <kuba@kernel.org>
> > > wrote:
> > > > > > >
> > > > > > > On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote:
> > > > > > > > > Can you say more? I'm curious what's your use case.
> > > > > > > >
> > > > > > > > I'm not working at Nvidia, so my point of view may differ
> > > > > > > > from
> > > theirs.
> > > > > > > > From what I can tell is that those two counters help me
> > > > > > > > narrow down the range if I have to diagnose/debug some issues.
> > > > > > >
> > > > > > > right, i'm asking to collect useful debugging tricks, nothing
> > > > > > > against the patch itself :)
> > > > > > >
> > > > > > > > 1) I sometimes notice that if some irq is held too long
> > > > > > > > (say, one simple case: output of printk printed to the
> > > > > > > > console), those two counters can reflect the issue.
> > > > > > > > 2) Similarly in virtio net, recently I traced such counters
> > > > > > > > the current kernel does not have and it turned out that one
> > > > > > > > of the output queues in the backend behaves badly.
> > > > > > > > ...
> > > > > > > >
> > > > > > > > Stop/wake queue counters may not show directly the root
> > > > > > > > cause of the issue, but help us 'guess' to some extent.
> > > > > > >
> > > > > > > I'm surprised you say you can detect stall-related issues with this.
> > > > > > > I guess virtio doesn't have BQL support, which makes it special.
> > > > > >
> > > > > > Yes, virtio-net has a legacy orphan mode, this is something that
> > > > > > needs to be dropped in the future. This would make BQL much more
> > > > > > easier to be implemented.
> > > > >
> > > > >
> > > > > It's not that we can't implement BQL, it's that it does not seem
> > > > > to be benefitial - has been discussed many times.
> > > > >
> > > > > > > Normal HW drivers with BQL almost never stop the queue by
> > > themselves.
> > > > > > > I mean - if they do, and BQL is active, then the system is
> > > > > > > probably misconfigured (queue is too short). This is what we
> > > > > > > use at Meta to detect stalls in drivers with BQL:
> > > > > > >
> > > > > > > https://lore.kernel.org/all/20240131102150.728960-3-leitao@deb
> > > > > > > ia
> > > > > > > n.or
> > > > > > > g/
> > > > > > >
> > > > > > > Daniel, I think this may be a good enough excuse to add
> > > > > > > per-queue stats to the netdev genl family, if you're up for
> > > > > > > that. LMK if you want more info, otherwise I guess ethtool -S
> > > > > > > is fine
> > > for now.
> > > > > > >
> > > > > >
> > > > > > Thanks
> > > >
> > > > Michael,
> > > > 	Are you OK with this patch? Unless I missed it I didn't see a
> > > > response
> > > from you in our conversation the day I sent it.
> > > >
> > >
> > > I thought what is proposed is adding some support for these stats to core?
> > > Did I misunderstood?
> > >
> > 
> > That's a much bigger change and going that route I think still need to count
> > them at the driver level. I said I could potentially take that on as a background
> > project. But would prefer to go with ethtool -S for now.
> 
> Michael, are you a NACK on this? Jakub seemed OK with it, Jason also thinks it's useful, and it's low risk. 


Not too bad ... Jakub can you confirm though?

> > 
> > > --
> > > MST
>
Jakub Kicinski Feb. 20, 2024, 10:27 p.m. UTC | #25
On Tue, 20 Feb 2024 13:05:46 -0500 Michael S. Tsirkin wrote:
> > Michael, are you a NACK on this? Jakub seemed OK with it, Jason
> > also thinks it's useful, and it's low risk.   
> 
> Not too bad ... Jakub can you confirm though?

Ughhm. Let me implement the generic API myself. I should have a few
hours tomorrow, knock wood. I guess the initial set of stats requires
some "know what you're doing", and it will be smooth sailing from
there..
Jason Xing June 7, 2024, 9:16 a.m. UTC | #26
On Sat, Feb 3, 2024 at 12:46 AM Daniel Jurgens <danielj@nvidia.com> wrote:
>
> > From: Jakub Kicinski <kuba@kernel.org>
> > Sent: Friday, February 2, 2024 10:01 AM
> > Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake counters
> >
> > On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote:
> > > > Can you say more? I'm curious what's your use case.
> > >
> > > I'm not working at Nvidia, so my point of view may differ from theirs.
> > > From what I can tell is that those two counters help me narrow down
> > > the range if I have to diagnose/debug some issues.
> >
> > right, i'm asking to collect useful debugging tricks, nothing against the patch
> > itself :)
> >
> > > 1) I sometimes notice that if some irq is held too long (say, one
> > > simple case: output of printk printed to the console), those two
> > > counters can reflect the issue.
> > > 2) Similarly in virtio net, recently I traced such counters the
> > > current kernel does not have and it turned out that one of the output
> > > queues in the backend behaves badly.
> > > ...
> > >
> > > Stop/wake queue counters may not show directly the root cause of the
> > > issue, but help us 'guess' to some extent.
> >
> > I'm surprised you say you can detect stall-related issues with this.
> > I guess virtio doesn't have BQL support, which makes it special.
> > Normal HW drivers with BQL almost never stop the queue by themselves.
> > I mean - if they do, and BQL is active, then the system is probably
> > misconfigured (queue is too short). This is what we use at Meta to detect
> > stalls in drivers with BQL:
> >
> > https://lore.kernel.org/all/20240131102150.728960-3-leitao@debian.org/
> >
> > Daniel, I think this may be a good enough excuse to add per-queue stats to
> > the netdev genl family, if you're up for that. LMK if you want more info,
> > otherwise I guess ethtool -S is fine for now.
>
> For now, I would like to go with ethtool -S. But I can take on the netdev level as a background task. Are you suggesting adding them to rtnl_link_stats64?

Hello Daniel, Jakub,

Sorry to revive this thread. I wonder why not use this patch like mlnx
driver does instead of adding statistics into the yaml file? Are we
gradually using or adding more fields into the yaml file to replace
the 'ethtool -S' command?

Thanks,
Jason
Dan Jurgens June 7, 2024, 12:36 p.m. UTC | #27
> From: Jason Xing <kerneljasonxing@gmail.com>
> Sent: Friday, June 7, 2024 4:16 AM
> To: Dan Jurgens <danielj@nvidia.com>
> Cc: Jakub Kicinski <kuba@kernel.org>; Michael S. Tsirkin <mst@redhat.com>;
> netdev@vger.kernel.org; jasowang@redhat.com;
> xuanzhuo@linux.alibaba.com; virtualization@lists.linux.dev;
> davem@davemloft.net; edumazet@google.com; abeni@redhat.com; Parav
> Pandit <parav@nvidia.com>
> Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake counters
> 
> On Sat, Feb 3, 2024 at 12:46 AM Daniel Jurgens <danielj@nvidia.com> wrote:
> >
> > > From: Jakub Kicinski <kuba@kernel.org>
> > > Sent: Friday, February 2, 2024 10:01 AM
> > > Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake
> > > counters
> > >
> > > On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote:
> > > > > Can you say more? I'm curious what's your use case.
> > > >
> > > > I'm not working at Nvidia, so my point of view may differ from theirs.
> > > > From what I can tell is that those two counters help me narrow
> > > > down the range if I have to diagnose/debug some issues.
> > >
> > > right, i'm asking to collect useful debugging tricks, nothing
> > > against the patch itself :)
> > >
> > > > 1) I sometimes notice that if some irq is held too long (say, one
> > > > simple case: output of printk printed to the console), those two
> > > > counters can reflect the issue.
> > > > 2) Similarly in virtio net, recently I traced such counters the
> > > > current kernel does not have and it turned out that one of the
> > > > output queues in the backend behaves badly.
> > > > ...
> > > >
> > > > Stop/wake queue counters may not show directly the root cause of
> > > > the issue, but help us 'guess' to some extent.
> > >
> > > I'm surprised you say you can detect stall-related issues with this.
> > > I guess virtio doesn't have BQL support, which makes it special.
> > > Normal HW drivers with BQL almost never stop the queue by themselves.
> > > I mean - if they do, and BQL is active, then the system is probably
> > > misconfigured (queue is too short). This is what we use at Meta to
> > > detect stalls in drivers with BQL:
> > >
> > > https://lore.kernel.org/all/20240131102150.728960-3-leitao@debian.or
> > > g/
> > >
> > > Daniel, I think this may be a good enough excuse to add per-queue
> > > stats to the netdev genl family, if you're up for that. LMK if you
> > > want more info, otherwise I guess ethtool -S is fine for now.
> >
> > For now, I would like to go with ethtool -S. But I can take on the netdev
> level as a background task. Are you suggesting adding them to
> rtnl_link_stats64?
> 
> Hello Daniel, Jakub,
> 
> Sorry to revive this thread. I wonder why not use this patch like mlnx driver
> does instead of adding statistics into the yaml file? Are we gradually using or
> adding more fields into the yaml file to replace the 'ethtool -S' command?
> 

It's trivial to have the stats in ethtool as well. But I noticed the stats series intentionally removed some stats from ethtool. So I didn't put it in both places.

> Thanks,
> Jason
Jason Xing June 8, 2024, 12:41 a.m. UTC | #28
On Fri, Jun 7, 2024 at 8:37 PM Dan Jurgens <danielj@nvidia.com> wrote:
>
> > From: Jason Xing <kerneljasonxing@gmail.com>
> > Sent: Friday, June 7, 2024 4:16 AM
> > To: Dan Jurgens <danielj@nvidia.com>
> > Cc: Jakub Kicinski <kuba@kernel.org>; Michael S. Tsirkin <mst@redhat.com>;
> > netdev@vger.kernel.org; jasowang@redhat.com;
> > xuanzhuo@linux.alibaba.com; virtualization@lists.linux.dev;
> > davem@davemloft.net; edumazet@google.com; abeni@redhat.com; Parav
> > Pandit <parav@nvidia.com>
> > Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake counters
> >
> > On Sat, Feb 3, 2024 at 12:46 AM Daniel Jurgens <danielj@nvidia.com> wrote:
> > >
> > > > From: Jakub Kicinski <kuba@kernel.org>
> > > > Sent: Friday, February 2, 2024 10:01 AM
> > > > Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake
> > > > counters
> > > >
> > > > On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote:
> > > > > > Can you say more? I'm curious what's your use case.
> > > > >
> > > > > I'm not working at Nvidia, so my point of view may differ from theirs.
> > > > > From what I can tell is that those two counters help me narrow
> > > > > down the range if I have to diagnose/debug some issues.
> > > >
> > > > right, i'm asking to collect useful debugging tricks, nothing
> > > > against the patch itself :)
> > > >
> > > > > 1) I sometimes notice that if some irq is held too long (say, one
> > > > > simple case: output of printk printed to the console), those two
> > > > > counters can reflect the issue.
> > > > > 2) Similarly in virtio net, recently I traced such counters the
> > > > > current kernel does not have and it turned out that one of the
> > > > > output queues in the backend behaves badly.
> > > > > ...
> > > > >
> > > > > Stop/wake queue counters may not show directly the root cause of
> > > > > the issue, but help us 'guess' to some extent.
> > > >
> > > > I'm surprised you say you can detect stall-related issues with this.
> > > > I guess virtio doesn't have BQL support, which makes it special.
> > > > Normal HW drivers with BQL almost never stop the queue by themselves.
> > > > I mean - if they do, and BQL is active, then the system is probably
> > > > misconfigured (queue is too short). This is what we use at Meta to
> > > > detect stalls in drivers with BQL:
> > > >
> > > > https://lore.kernel.org/all/20240131102150.728960-3-leitao@debian.or
> > > > g/
> > > >
> > > > Daniel, I think this may be a good enough excuse to add per-queue
> > > > stats to the netdev genl family, if you're up for that. LMK if you
> > > > want more info, otherwise I guess ethtool -S is fine for now.
> > >
> > > For now, I would like to go with ethtool -S. But I can take on the netdev
> > level as a background task. Are you suggesting adding them to
> > rtnl_link_stats64?
> >
> > Hello Daniel, Jakub,
> >
> > Sorry to revive this thread. I wonder why not use this patch like mlnx driver
> > does instead of adding statistics into the yaml file? Are we gradually using or
> > adding more fields into the yaml file to replace the 'ethtool -S' command?
> >
>
> It's trivial to have the stats in ethtool as well. But I noticed the stats series intentionally removed some stats from ethtool. So I didn't put it both places.

Thank you for the reply. I thought there was some particular reason :-)
Jakub Kicinski June 10, 2024, 5:57 p.m. UTC | #29
On Sat, 8 Jun 2024 08:41:44 +0800 Jason Xing wrote:
> > > Sorry to revive this thread. I wonder why not use this patch like mlnx driver
> > > does instead of adding statistics into the yaml file? Are we gradually using or
> > > adding more fields into the yaml file to replace the 'ethtool -S' command?
> > >  
> >
> > It's trivial to have the stats in ethtool as well. But I noticed
> > the stats series intentionally removed some stats from ethtool. So
> > I didn't put it both places.  
> 
> Thank you for the reply. I thought there was some particular reason
> :-)

Yes, we don't want duplication. We have a long standing (and
documented) policy against duplicating normal stats in custom stat
APIs, otherwise vendors pile everything into the custom stats.
Jason Xing June 11, 2024, 2:05 a.m. UTC | #30
On Tue, Jun 11, 2024 at 1:57 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sat, 8 Jun 2024 08:41:44 +0800 Jason Xing wrote:
> > > > Sorry to revive this thread. I wonder why not use this patch like mlnx driver
> > > > does instead of adding statistics into the yaml file? Are we gradually using or
> > > > adding more fields into the yaml file to replace the 'ethtool -S' command?
> > > >
> > >
> > > It's trivial to have the stats in ethtool as well. But I noticed
> > > the stats series intentionally removed some stats from ethtool. So
> > > I didn't put it both places.
> >
> > Thank you for the reply. I thought there was some particular reason
> > :-)
>
> Yes, we don't want duplication. We have a long standing (and
> documented) policy against duplicating normal stats in custom stat
> APIs, otherwise vendors pile everything into the custom stats.

Thanks, Jakub. I see :)
diff mbox series

Patch

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 3cb8aa193884..7e3c31ceaf7e 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -88,6 +88,8 @@  struct virtnet_sq_stats {
 	u64_stats_t xdp_tx_drops;
 	u64_stats_t kicks;
 	u64_stats_t tx_timeouts;
+	u64_stats_t tx_stop;
+	u64_stats_t tx_wake;
 };
 
 struct virtnet_rq_stats {
@@ -112,6 +114,8 @@  static const struct virtnet_stat_desc virtnet_sq_stats_desc[] = {
 	{ "xdp_tx_drops",	VIRTNET_SQ_STAT(xdp_tx_drops) },
 	{ "kicks",		VIRTNET_SQ_STAT(kicks) },
 	{ "tx_timeouts",	VIRTNET_SQ_STAT(tx_timeouts) },
+	{ "tx_stop",		VIRTNET_SQ_STAT(tx_stop) },
+	{ "tx_wake",		VIRTNET_SQ_STAT(tx_wake) },
 };
 
 static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = {
@@ -843,6 +847,9 @@  static void check_sq_full_and_disable(struct virtnet_info *vi,
 	 */
 	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
 		netif_stop_subqueue(dev, qnum);
+		u64_stats_update_begin(&sq->stats.syncp);
+		u64_stats_inc(&sq->stats.tx_stop);
+		u64_stats_update_end(&sq->stats.syncp);
 		if (use_napi) {
 			if (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
 				virtqueue_napi_schedule(&sq->napi, sq->vq);
@@ -851,6 +858,9 @@  static void check_sq_full_and_disable(struct virtnet_info *vi,
 			free_old_xmit_skbs(sq, false);
 			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
 				netif_start_subqueue(dev, qnum);
+				u64_stats_update_begin(&sq->stats.syncp);
+				u64_stats_inc(&sq->stats.tx_wake);
+				u64_stats_update_end(&sq->stats.syncp);
 				virtqueue_disable_cb(sq->vq);
 			}
 		}
@@ -2163,8 +2173,14 @@  static void virtnet_poll_cleantx(struct receive_queue *rq)
 			free_old_xmit_skbs(sq, true);
 		} while (unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
 
-		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
+		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
+			if (netif_tx_queue_stopped(txq)) {
+				u64_stats_update_begin(&sq->stats.syncp);
+				u64_stats_inc(&sq->stats.tx_wake);
+				u64_stats_update_end(&sq->stats.syncp);
+			}
 			netif_tx_wake_queue(txq);
+		}
 
 		__netif_tx_unlock(txq);
 	}
@@ -2310,8 +2326,14 @@  static int virtnet_poll_tx(struct napi_struct *napi, int budget)
 	virtqueue_disable_cb(sq->vq);
 	free_old_xmit_skbs(sq, true);
 
-	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
+	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
+		if (netif_tx_queue_stopped(txq)) {
+			u64_stats_update_begin(&sq->stats.syncp);
+			u64_stats_inc(&sq->stats.tx_wake);
+			u64_stats_update_end(&sq->stats.syncp);
+		}
 		netif_tx_wake_queue(txq);
+	}
 
 	opaque = virtqueue_enable_cb_prepare(sq->vq);